Request flushing denorms to zero in OpenCL

author Szilárd Páll <pall.szilard@gmail.com>

Mon, 30 Jul 2018 16:18:37 +0000 (18:18 +0200)

committer Szilárd Páll <pall.szilard@gmail.com>

Tue, 21 Aug 2018 14:54:42 +0000 (16:54 +0200)
diff --git a/docs/release-notes/2018/2018.3.rst b/docs/release-notes/2018/2018.3.rst
index 58d8bef243f97710d0b1ae7c6beb2c12763fe675..2bda2a55c8c28a8fcf5fb355b8bc9c1a4a1e7260 100644
--- a/docs/release-notes/2018/2018.3.rst
+++ b/docs/release-notes/2018/2018.3.rst
@@ -82,3 +82,12 @@ Fixes to improve portability
  
  Miscellaneous
  ^^^^^^^^^^^^^
+
+Improve OpenCL kernel performance on AMD Vega GPUs
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+The OpenCL kernel optimization flags did not explicitly turn off denorm handling,
+which could lead to performance loss. Flushing denorms to zero is now explicitly
+requested, both for consistency with CUDA and for performance reasons.
+On AMD Vega GPUs (with ROCm) kernel performance improves by up to 30%.
+
+
diff --git a/src/gromacs/gpu_utils/ocl_compiler.cpp b/src/gromacs/gpu_utils/ocl_compiler.cpp
index 957c6bc45387d0832fc20d850e6a6addb47cefab..f54a94fa42dca39c6f4e2683933c4da7df178659 100644
--- a/src/gromacs/gpu_utils/ocl_compiler.cpp
+++ b/src/gromacs/gpu_utils/ocl_compiler.cpp
@@ -179,6 +179,11 @@ selectCompilerOptions(ocl_vendor_id_t deviceVendorId)
      if (getenv("GMX_OCL_DISABLE_FASTMATH") == NULL)
      {
          compilerOptions += " -cl-fast-relaxed-math";
+
+        // Hint to the compiler that it can flush denorms to zero.
+        // In CUDA this is triggered by the -use_fast_math flag, the equivalent of
+        // -cl-fast-relaxed-math, hence the inclusion in the conditional block.
+        compilerOptions += " -cl-denorms-are-zero";
      }
  
      if ((deviceVendorId == OCL_VENDOR_NVIDIA) && getenv("GMX_OCL_VERBOSE"))
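
The effect of the second hunk on the build-option string can be sketched in isolation. This is a minimal illustration, not the GROMACS implementation: the helper names mathOptions and mathOptionsFromEnvironment are hypothetical, and only the environment variable and the two option strings are taken from the patch above.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper (not part of GROMACS): returns the math-related
// OpenCL build options, given whether the user disabled fast math.
std::string mathOptions(bool fastMathDisabled)
{
    std::string options;
    if (!fastMathDisabled)
    {
        options += " -cl-fast-relaxed-math";
        // Added by this change: allow the compiler to flush denorms to zero.
        options += " -cl-denorms-are-zero";
    }
    return options;
}

// Hypothetical wrapper: reads the same environment variable as the patched
// code; any value, including an empty one, disables both options.
std::string mathOptionsFromEnvironment()
{
    return mathOptions(std::getenv("GMX_OCL_DISABLE_FASTMATH") != nullptr);
}
```

Because both options sit in the same conditional block, setting GMX_OCL_DISABLE_FASTMATH turns off denorm flushing together with relaxed math. In an OpenCL host program a string like this would typically be passed as the options argument of clBuildProgram.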