CUDA cut-off kernels now shift exclusion energies
This makes the GPU kernels consistent with the CPU nbnxn kernels,
which for eeltype=cut-off with modifier=potential-shift effectively
do reaction-field with epsilon_rf=1.
Also implemented shifting of the Coulomb potential in the group
cut-off scheme, for non-excluded pairs only.
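
For reference, a minimal sketch of the energy expressions involved,
in reduced units (electric conversion factor set to 1). The function
and parameter names are hypothetical and are not taken from the
GROMACS source. The reaction-field form with k_rf and c_rf is assumed
to be the standard one, and shift_exclusions=False is only a stand-in
for the group-scheme behaviour described above.

```python
# Illustrative sketch only: names are hypothetical, units are reduced (f = 1).

def rf_constants(eps_rf, eps_r, rc):
    """Return (k_rf, c_rf) for the assumed standard reaction-field form."""
    k_rf = (eps_rf - eps_r) / ((2.0 * eps_rf + eps_r) * rc**3)
    c_rf = 1.0 / rc + k_rf * rc**2
    return k_rf, c_rf


def coulomb_pair_energy(qi, qj, r, rc, excluded=False,
                        shift_exclusions=True, eps_rf=1.0, eps_r=1.0):
    """Coulomb energy of one pair under a potential-shifted cut-off.

    excluded=True models an excluded pair, which has no 1/r term.
    shift_exclusions=True applies the shift to excluded pairs as well;
    False leaves excluded pairs at zero (assumed group-scheme behaviour).
    """
    if r >= rc:
        return 0.0
    k_rf, c_rf = rf_constants(eps_rf, eps_r, rc)
    qq = qi * qj / eps_r
    if excluded:
        if not shift_exclusions:
            return 0.0
        # Only the reaction-field and shift terms remain for exclusions.
        return qq * (k_rf * r * r - c_rf)
    return qq * (1.0 / r + k_rf * r * r - c_rf)
```

With epsilon_rf=1 and epsilon_r=1, k_rf vanishes, so a non-excluded
pair gets q_i q_j (1/r - 1/r_c) and a shifted excluded pair gets
-q_i q_j / r_c.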
The manual now explains these details.
Also fixed the generic group kernel for the case of an exact cut-off
for either Coulomb or VdW. This combination was probably never meant
to be supported, but neither grompp nor mdrun checked for it. The fix
is trivial.
Change-Id: I48bff73587e43338162f90fa7d526e1909ce5ad1