disable the use of the lower-latency cudaLaunchKernel API even when supported (CUDA >=v7.0).
Should only be used for benchmarking purposes.
+``GMX_DISABLE_CUDA_TIMING``
+ Disables GPU timing of CUDA tasks; synonymous with ``GMX_DISABLE_GPU_TIMING``.
+
``GMX_CYCLE_ALL``
times all code during runs. Incompatible with threads.
disables architecture-specific SIMD-optimized (SSE2, SSE4.1, AVX, etc.)
non-bonded kernels thus forcing the use of plain C kernels.
-``GMX_DISABLE_CUDA_TIMING``
+``GMX_DISABLE_GPU_TIMING``
timing of asynchronously executed GPU operations can have a
non-negligible overhead with short step times. Disabling timing can improve performance in these cases.
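
The check introduced by this change can be sketched in isolation as below. This is a minimal sketch, not the code in the tree: the helper names `any_env_set` and `gpu_timing_enabled` are hypothetical, and only the two variable names and the "set to any value disables timing" behaviour are taken from the patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: returns true if any of the n named environment
 * variables is set to any value, including the empty string. */
static bool any_env_set(const char *const names[], size_t n)
{
    for (size_t i = 0; i < n; i++)
    {
        if (getenv(names[i]) != NULL)
        {
            return true;
        }
    }
    return false;
}

/* Hypothetical helper mirroring the CUDA condition in the patch: timing
 * stays enabled only with a single stream and with neither the old nor
 * the new disabling variable set. */
static bool gpu_timing_enabled(bool use_two_streams)
{
    static const char *const disable_vars[] = {
        "GMX_DISABLE_CUDA_TIMING", /* older, CUDA-specific name */
        "GMX_DISABLE_GPU_TIMING",  /* newer, backend-neutral name */
    };
    size_t n = sizeof(disable_vars) / sizeof(disable_vars[0]);

    return !use_two_streams && !any_env_set(disable_vars, n);
}
```

Keeping the accepted names in one table would make it straightforward to drop a deprecated name later; the patch itself keeps the explicit `getenv` calls inline.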
/* CUDA timing disabled as event timers don't work:
- with multiple streams = domain-decomposition;
- - when turned off by GMX_DISABLE_CUDA_TIMING.
+ - when turned off by GMX_DISABLE_CUDA_TIMING/GMX_DISABLE_GPU_TIMING.
*/
nb->bDoTime = (!nb->bUseTwoStreams &&
- (getenv("GMX_DISABLE_CUDA_TIMING") == NULL));
+ (getenv("GMX_DISABLE_CUDA_TIMING") == NULL) &&
+ (getenv("GMX_DISABLE_GPU_TIMING") == NULL));
if (nb->bDoTime)
{
init_plist(nb->plist[eintLocal]);
/* OpenCL timing disabled if GMX_DISABLE_OCL_TIMING is defined. */
- nb->bDoTime = (getenv("GMX_DISABLE_OCL_TIMING") == NULL);
+ /* TODO: deprecate GMX_DISABLE_OCL_TIMING in the 2017 release. */
+ nb->bDoTime = (getenv("GMX_DISABLE_OCL_TIMING") == NULL &&
+ getenv("GMX_DISABLE_GPU_TIMING") == NULL);
/* Create queues only after bDoTime has been initialized */
if (nb->bDoTime)