Verlet buffer calculated too small with 2-wide SIMD
Due to a missing SIMD include file, grompp and mdrun calculated the Verlet buffer size for rlist too small with 2-wide SIMD. This could lead to larger energy drift due to particles moving into interaction range during the nstlist steps between pair-list updates, but the difference in buffer size between 4x2 and 4x4 is small (up to 15%). The architectures with 2-wide SIMD are SSE2, SSE4 and AVX128-FMA (all in double precision only) and SPARC64.
Fixed Verlet buffer issue with 2-wide SIMD
The Verlet buffer size for CPUs was always calculated for 4x4.
With 2-wide SIMD the estimate should be for 4x2, which results
in a slightly larger list buffer.
grompp now always sets rlist for a 4x4 list setup; mdrun redetermines
rlist at run time anyway (a note about this was added to grompp).
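
The choice of list setup described above can be illustrated with a minimal sketch. This is not GROMACS source code: the function name `list_setup`, the `at_runtime` flag, and the fallback to 4x4 for every width other than 2 are assumptions made for illustration only. It shows just which cluster-pair setup the buffer estimate would be based on, not the buffer calculation itself.

```python
# Hypothetical sketch, not GROMACS code: all names here are illustrative.

I_CLUSTER_SIZE = 4            # i-cluster size in both setups named in the text
REFERENCE_J_CLUSTER_SIZE = 4  # the 4x4 reference setup


def list_setup(simd_width: int, at_runtime: bool) -> str:
    """Return the pair-list setup the rlist buffer estimate is based on.

    at_runtime=False models grompp, which always assumes 4x4.
    at_runtime=True models mdrun, which redetermines rlist and should
    use 4x2 when the SIMD width is 2.
    Assumption: any other width is mapped to the 4x4 reference setup;
    the text does not describe those cases.
    """
    if simd_width < 1:
        raise ValueError("simd_width must be at least 1")
    if at_runtime and simd_width == 2:
        j_cluster_size = 2
    else:
        j_cluster_size = REFERENCE_J_CLUSTER_SIZE
    return f"{I_CLUSTER_SIZE}x{j_cluster_size}"


if __name__ == "__main__":
    for width in (2, 4):
        print(width, list_setup(width, at_runtime=False),
              list_setup(width, at_runtime=True))
```

In this sketch, only the run-time estimate with 2-wide SIMD differs from the 4x4 reference, which matches the case the fix addresses.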