allow compilation to optimize for CUDA compute capability 3.5