CUDA PME kernels with analytical Ewald correction
The analytical Ewald kernels are already used in the CPU SIMD kernels,
but due to CUDA compiler issues it has been difficult to determine in
which cases they provide a performance advantage over the tabulated
kernels. Although the nvcc optimizations are rather unreliable, on
Kepler (SM 3.x) the analytical Ewald kernels are up to 5% faster,
whereas on Fermi (SM 2.x) they are 7% slower than the tabulated ones.
Hence, this commit makes the analytical kernels the default on Kepler
GPUs, but keeps the tabulated kernels as the default on Fermi.
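The default selection described above can be sketched roughly as
follows. This is only an illustration: the enum and function names are
hypothetical and not the identifiers used in the code, and treating
every compute capability of 3 or higher like Kepler is an assumption,
since only SM 2.x and SM 3.x were measured.

```c
/* Hypothetical names for illustration; not the actual identifiers. */
typedef enum { EWALD_TABULATED, EWALD_ANALYTICAL } ewald_kernel_t;

/* Choose the default Ewald kernel flavor from the major compute
 * capability of the device (2 = Fermi, 3 = Kepler). */
ewald_kernel_t pick_default_ewald_kernel(int sm_major)
{
    /* Analytical was measured to be faster on SM 3.x and slower on
     * SM 2.x. Assumption: anything above SM 3.x behaves like Kepler. */
    return (sm_major >= 3) ? EWALD_ANALYTICAL : EWALD_TABULATED;
}
```

With this sketch, a Fermi device (sm_major == 2) gets the tabulated
kernels and a Kepler device (sm_major == 3) gets the analytical ones.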
Note that the analytical Ewald correction is not implemented in the
legacy kernels, as these are only used on Fermi anyway.
An additional minor change is the back-port of some variable (re)naming
and simple optimizations from the default to the legacy CUDA kernels,
which gives a 2-3% performance improvement and better code readability.
Change-Id: Idd4659ef3805609356fe8865dc57fd19b0b614fe