- Registered on: 12/08/2017
- Last connection: 11/20/2019
- 09:01 PM GROMACS Revision d5072ffa: Heuristics for switching between CUDA spread/gather kernels
- The various CUDA spread/gather kernels perform better in different circumstances,
so heuristics are used to control w...
- 11:50 AM GROMACS Task #3212 (New): Update regression tests for new kernel flavours
- We now have 4 potential code paths through the spread and gather. Once the tuning is done they will be automatically ...
- 08:50 PM GROMACS Task #3189: implement heuristics for switching between different spread/gather kernel layouts
- Szilárd Páll wrote:
> Correction: actually, the number of atoms is more relevant, though heuristics might still ...
- 04:27 PM GROMACS Task #3189: implement heuristics for switching between different spread/gather kernel layouts
- OK, did an initial version of this, keying only off the number of atoms: https://gerrit.gromacs.org/#/c/gr...
- 09:18 PM GROMACS Task #3188: re-enable parallel spline calculation for #threads/atoms > 4
- I had a better look at what PME_PARALLEL_SPLINE does and it does 3 things.
Uses 12 threads instead of 3 for the fi...
- 01:21 PM GROMACS Revision 22118220: Update PME CUDA spread/gather
- Adds additional templated variants of the CUDA spread and
gather kernels, allowing the use of 4 threads per atom instea...
- 11:33 AM GROMACS Task #3187 (New): Template updated PME kernels using threads per atom
- Currently we are templating using a bool to control between order (4) threads per atom and order*order (16) threads pe...
- 11:23 AM GROMACS Task #3186 (In Progress): Update Constant/Variable naming in the PME GPU kernels.
- 11:22 AM GROMACS Task #3186 (Closed): Update Constant/Variable naming in the PME GPU kernels.
- 11:21 AM GROMACS Task #3186 (In Progress): Update Constant/Variable naming in the PME GPU kernels.
- Additional variables and constants were required to support the 4 threads per atom implementation. A new naming schem...
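
For readers following Tasks #3187 and #3189, here is a minimal host-side sketch of the two ideas discussed in this feed: templating on a threads-per-atom setting instead of a bool, and a heuristic keyed off the number of atoms. Every name in it (`ThreadsPerAtom`, `threadsPerAtomCount`, `chooseThreadsPerAtom`, `c_hypotheticalAtomThreshold`) is hypothetical and not taken from the GROMACS source. The threshold value and the direction of the switch are placeholders, since the feed does not state them.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical names throughout; this is an illustration, not the GROMACS API.

// Replaces a bool template parameter with a self-describing enum.
enum class ThreadsPerAtom
{
    Order,       // order threads per atom (4 when order is 4)
    OrderSquared // order*order threads per atom (16 when order is 4)
};

// Interpolation order; 4 matches the "order (4)" mentioned in Task #3187.
constexpr int c_pmeOrder = 4;

// Compile-time thread count for a given template setting.
template<ThreadsPerAtom threadsPerAtom>
constexpr int threadsPerAtomCount()
{
    return threadsPerAtom == ThreadsPerAtom::Order ? c_pmeOrder : c_pmeOrder * c_pmeOrder;
}

// Placeholder threshold: an assumption for illustration, not a tuned value.
constexpr std::size_t c_hypotheticalAtomThreshold = 10000;

// Heuristic keyed only off the number of atoms. The direction of the switch
// (fewer threads per atom for larger systems) is assumed here.
inline ThreadsPerAtom chooseThreadsPerAtom(std::size_t numAtoms)
{
    return numAtoms > c_hypotheticalAtomThreshold ? ThreadsPerAtom::Order
                                                  : ThreadsPerAtom::OrderSquared;
}

// Runtime dispatch from the heuristic's choice to the templated code path.
inline int threadsForChoice(ThreadsPerAtom choice)
{
    assert(choice == ThreadsPerAtom::Order || choice == ThreadsPerAtom::OrderSquared);
    return choice == ThreadsPerAtom::Order
                   ? threadsPerAtomCount<ThreadsPerAtom::Order>()
                   : threadsPerAtomCount<ThreadsPerAtom::OrderSquared>();
}
```

An enum template parameter keeps call sites readable (`threadsPerAtomCount<ThreadsPerAtom::Order>()` rather than `<true>`) and leaves room for further layouts without changing the signature.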