allow compilation to optimize for CUDA compute cap. 3.5
author Szilard Pall <pall.szilard@gmail.com>
Thu, 26 Sep 2013 10:45:55 +0000 (12:45 +0200)
committer Gerrit Code Review <gerrit@gerrit.gromacs.org>
Sun, 29 Sep 2013 21:39:17 +0000 (23:39 +0200)
commit cd01238b6b0eca4ddf115efc3abda44e98eabe6d
tree 70c75d1795d89ad2f8811759c764739faab06b20
parent c839c0e9be3d39ab4ac84fcbc377ded1ecddf9b8

Enabling optimizations that target compute capability 3.5 devices
(GK110) slightly improves the performance of both the PME and RF
kernels. This requires giving the compiler a launch-bounds hint that
specifies the maximum number of threads per block and the minimum
number of blocks per multiprocessor. This change allows nvcc >= 5.0
to generate code for CC 3.5 devices and switches to including PTX
code for CC 3.5 (instead of CC 3.0) in the binary.
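
The hint described above can be sketched with CUDA's
__launch_bounds__ qualifier. This is a minimal illustration, not the
nbnxn kernel code: the kernel, the helper scale_on_device, the macro
names, and the bound values (64 threads/block, 16 blocks/multiprocessor)
are assumptions chosen for the example.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Illustrative values (assumptions for this sketch, not taken from the commit).
#define THREADS_PER_BLOCK 64   // maximum threads per block promised to the compiler
#define MIN_BLOCKS_PER_MP 16   // minimum resident blocks per multiprocessor requested

// Emit the hint only in the device compilation pass for CC >= 3.5;
// every other target gets a plain __global__ kernel.
#if defined(__CUDA_ARCH__) && __CUDA_ARCH__ >= 350
#define KERNEL_LAUNCH_BOUNDS __launch_bounds__(THREADS_PER_BLOCK, MIN_BLOCKS_PER_MP)
#else
#define KERNEL_LAUNCH_BOUNDS
#endif

// Hypothetical kernel: multiplies each element of x by a.
__global__ void KERNEL_LAUNCH_BOUNDS scale_kernel(float *x, float a, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
    {
        x[i] *= a;
    }
}

// Hypothetical host helper: scales v in place on the GPU.
// Returns false if any CUDA call fails.
bool scale_on_device(std::vector<float> &v, float a)
{
    if (v.empty())
    {
        return true;
    }
    const int    n     = static_cast<int>(v.size());
    const size_t bytes = v.size() * sizeof(float);
    float       *d     = nullptr;
    if (cudaMalloc(&d, bytes) != cudaSuccess)
    {
        return false;
    }
    bool ok = cudaMemcpy(d, v.data(), bytes, cudaMemcpyHostToDevice) == cudaSuccess;
    if (ok)
    {
        // The block size must not exceed the first launch-bounds argument,
        // otherwise the launch fails on targets where the hint is active.
        const int blocks = (n + THREADS_PER_BLOCK - 1) / THREADS_PER_BLOCK;
        scale_kernel<<<blocks, THREADS_PER_BLOCK>>>(d, a, n);
        ok = cudaGetLastError() == cudaSuccess
             && cudaMemcpy(v.data(), d, bytes, cudaMemcpyDeviceToHost) == cudaSuccess;
    }
    cudaFree(d);
    return ok;
}
```

With a toolkit that still supports this target, building with
"-gencode arch=compute_35,code=sm_35 -gencode arch=compute_35,code=compute_35"
produces both native CC 3.5 code and CC 3.5 PTX, which is the kind of
binary layout the message describes. Calling scale_on_device on a
std::vector<float> scales it in place and reports whether every CUDA
call succeeded.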

Change-Id: If7e14d31165bc05859250db7468bf6bd8c186264
cmake/gmxManageNvccConfig.cmake
src/mdlib/nbnxn_cuda/nbnxn_cuda_kernel.cuh
src/mdlib/nbnxn_cuda/nbnxn_cuda_kernel_legacy.cuh