Fix CUDA architecture-dependent issues