Bug #292

PME problem in 4.0.3 results in memory allocation crash

Added by David Mobley over 11 years ago. Updated over 11 years ago.

Assignee: Erik Lindahl


Created an attachment (id=350)
Tarball to reproduce problem

I have run into a bug evaluating energies of a single simulation snapshot. Energy evaluation works fine in Gromacs 3.3.1 and in 4.0.3 without PME, and also with PME on and with the partial charges on my small molecule turned off. But with the partial charges turned on, mdrun crashes instantly with the following error:

Program dmdrun, VERSION 4.0.3
Source code file: fftgrid.c, line: 78

Fatal error:
Failed to allocated 578699296 bytes of aligned memory.

It seems to be a PME problem, since (a) it works fine without PME, and (b) the referenced code is the FFT code. At the same time, there is more to the problem than just PME, since running this works fine with the partial charges on the "Protein" (here a small molecule) turned off.

Tarball to reproduce the bug is attached. Either enclosed *.sh file will result in the described crash.

Note that I compiled with FFTW 2.

bug.tar.gz (24.3 KB) - Tarball to reproduce problem - David Mobley, 02/09/2009 08:29 PM


#1 Updated by David van der Spoel over 11 years ago

grompp crashes for me because of an error in the topology, but mdrun with your topology works (although I do get a LINCS warning). This is the result:

@ s0 legend "dVpot/dlambda"
0.000000 7.736252

It could of course be due to FFTW2.

#2 Updated by David van der Spoel over 11 years ago


I find in your input file

nkx = 416
nky = 416
nkz = 416

Is this correct?
The amount of memory that the program tries to allocate corresponds to 416^3 x 8 bytes.

Interestingly, it crashes with FFTW2, while it hangs with FFTW3.
More later.
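The arithmetic behind that allocation can be checked directly. A quick sketch (the attribution of the remaining difference to FFT padding and alignment overhead is my assumption, not verified against fftgrid.c):

```python
# One PME grid of 416^3 double-precision (8-byte) values, as quoted above.
n = 416
grid_bytes = n ** 3 * 8
print(grid_bytes)            # 575930368 bytes
print(grid_bytes / 2 ** 20)  # 549.25 MiB
# The error message reports 578699296 bytes; the extra ~2.7 MB is
# plausibly FFT padding of the grid plus alignment overhead.
```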

#3 Updated by David Mobley over 11 years ago

I didn't intentionally put that in my input file. See fullcoupling.mdp. Concerning PME, I just set:

fourierspacing = 0.1
; FFT grid size, when a value is 0 fourierspacing will be used =
fourier_nx = 0
fourier_ny = 0
fourier_nz = 0
; EWALD/PME/PPPM parameters =
pme_order = 6
ewald_rtol = 1e-06
epsilon_surface = 0
optimize_fft = no

The pme_order, ewald_rtol, and fourierspacing are what I want. I don't even know what nkx, nky, and nkz do...

If there is a problem with the topology let me know and I can fix it.
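For what it's worth, a grid of 416 is consistent with these settings. A hedged sketch of how a fourierspacing turns into a grid size when fourier_nx = 0 (the names `fft_friendly` and `grid_points` are illustrative, not GROMACS functions; the exact rounding rules and allowed prime factors differ between versions, and the ~41.6 nm box edge is an assumption consistent with the ">40 nm" box mentioned later in this thread):

```python
def fft_friendly(n, primes=(2, 3, 5, 7, 11, 13)):
    """True if n factors entirely into the given small primes,
    i.e. sizes the FFT library handles efficiently."""
    for p in primes:
        while n % p == 0:
            n //= p
    return n == 1

def grid_points(box_nm, spacing_nm):
    """Round box/spacing to the nearest integer, then bump up
    to an FFT-friendly size."""
    n = round(box_nm / spacing_nm)
    while not fft_friendly(n):
        n += 1
    return n

print(grid_points(41.6, 0.1))  # 416 under these assumptions (416 = 2^5 * 13)
```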

#4 Updated by Erik Lindahl over 11 years ago

I don't think this is a bug; the memory dimensions are quite correct based on your system and PME settings (a spacing of 0.1 nm with all box dimensions around 40 nm creates a 416^3 grid, which in double precision requires ~580 MB).

This error message simply means the system returns NULL for out-of-memory when we call malloc() with this size, and there isn't a whole lot we can do about that.

FFTW2, FFTW3, and Intel MKL might behave slightly differently, depending both on the amount of memory they allocate internally and on whether we need an extra scratch array to copy data. We always try to transform in-place, but that is not possible with all FFT libraries.

Make sure you're compiling in 64-bit mode - otherwise you might be hitting the 2 GB process limit.



#5 Updated by David Mobley over 11 years ago

OK, but this is weird:
- It works fine with exactly the same system as long as the partial charges are turned off on the solute, but not the solvent.
- It works fine in 3.3.x

I don't get how that fits with the "it's just too big a system" scenario.

I guess I can do a workaround -- I'm just trying to reproduce a calculation I did in 3.3.x; if 4.x can't handle it I'll just have to revisit 3.3.x and redo with something 4.x can handle.


#6 Updated by David van der Spoel over 11 years ago

The problem happens when the B array is being allocated (because of free energy there are two FFT grids). Is this the difference between having charges off and on: that when you turn them on, you turn on free energy as well?
Nevertheless, the 2 GB limit is per block of memory even on a 32-bit machine, as far as I know, so it shouldn't be a problem. Both machines I used for testing had 4 GB RAM; one was an Apple (32-bit), the other an x86_64 Linux box.
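A rough sketch of the memory pressure being discussed: with two FFT grids for free energy plus a possible scratch copy, the total creeps toward a 32-bit process's address-space limit (the worst-case scratch term and the limit figure are illustrative assumptions, not measurements):

```python
# With free energy on, PME keeps an A-state and a B-state grid; an FFT
# library that cannot transform in-place needs a scratch copy on top.
grid = 416 ** 3 * 8      # one double-precision grid, in bytes
total = 2 * grid + grid  # A grid + B grid + worst-case scratch copy
print(total / 2 ** 30)   # ~1.6 GiB
# Within a 32-bit process's roughly 2-3 GiB of usable address space,
# but close enough that library-internal allocations could tip it over,
# which would fit the FFTW2-crashes / FFTW3-works observation.
```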

#7 Updated by David Mobley over 11 years ago

The B array should be allocated in all cases: in all cases I have free_energy on and there is a distinct B state. In the cases that work, the B state has modified charges on the solute; in the cases that don't work (in 4.0.3), the A and B states both have zero charges on the solute and it is the LJ interactions that are being modified.

I can compile with fftw3, I suppose, and see if that helps.

I'm on an x86_64 Linux box.

#8 Updated by Berk Hess over 11 years ago

PME is only aware of free energy when charges are changed.

I assume the error is simply due to a too large memory block
request and the error is handled correctly, or not?

You probably didn't intend to use a box of more than 40 nm.

Can we close this bug?


#9 Updated by David Mobley over 11 years ago

Well, the problem did go away when I recompiled with fftw3.

I did actually intend to use a box that large. I was trying to minimize the PME contribution to the total energy for a test, but at the same time still use PME.

It is puzzling to me that, if this is not a bug, I can get away with doing this in (a) 3.3.x with FFTW2, (b) 4.0.x with FFTW3, but not (c) 4.0.x with FFTW2.

Anyway, I'm done with this, but I still think there may be some underlying problem here.

#10 Updated by Berk Hess over 11 years ago

I agree that it is somewhat strange that this depends on the fftw version.
But if the problem is running out of memory, it could simply be that
you are close to the limit and fftw3 is more memory efficient than fftw2.

But Gromacs seems to give a clear error message.
So I would say there are no Gromacs issues here.


#11 Updated by Berk Hess over 11 years ago

This is not a Gromacs bug.

