Relax OpenCL gather kernel barrier on AMD

author Szilárd Páll <pall.szilard@gmail.com>

Tue, 9 Oct 2018 00:56:23 +0000 (02:56 +0200)

committer Berk Hess <hess@kth.se>

Tue, 5 Mar 2019 19:56:37 +0000 (20:56 +0100)
diff --git a/src/gromacs/ewald/pme-gather.clh b/src/gromacs/ewald/pme-gather.clh

index e795d4dd7086f7862b46327cd04a61481a17cb1f..9061701fcc73a203290e85809d543d346aad69ec 100644
--- a/src/gromacs/ewald/pme-gather.clh
+++ b/src/gromacs/ewald/pme-gather.clh
@@ -1,7 +1,7 @@
  /*
   * This file is part of the GROMACS molecular simulation package.
   *
- * Copyright (c) 2018, by the GROMACS development team, led by
+ * Copyright (c) 2018,2019, by the GROMACS development team, led by
   * Mark Abraham, David van der Spoel, Berk Hess, and Erik Lindahl,
   * and including many others, as listed in the AUTHORS file in the
   * top-level source directory and at http://www.gromacs.org.
@@ -122,11 +122,15 @@ inline void reduce_atom_forces(__local float * __restrict__  sm_forces,
          int elementIndex = smemReserved + lineIndex;
          // Store input force contributions
          sm_forceReduction[elementIndex] = (dimIndex == XX) ? fx : (dimIndex == YY) ? fy : fz;
-        /* This barrier was not needed in CUDA. Different OpenCL compilers might have different ideas
+
+#if !defined(_AMD_SOURCE_)
+        /* This barrier was not needed in CUDA, nor is it needed on AMD GPUs.
+         * Different OpenCL compilers might have different ideas
           * about #pragma unroll, though. OpenCL 2 has _attribute__((opencl_unroll_hint)).
           * #2519
           */
          barrier(CLK_LOCAL_MEM_FENCE);
+#endif
  
          // Reduce to fit into smemPerDim (warp size)
  #pragma unroll
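
The hunk above makes the local-memory barrier conditional on a vendor macro, so it is compiled out when _AMD_SOURCE_ is defined. The following is a minimal host-side C sketch of that guard pattern only, not the GROMACS kernel: barrier_stub(), store_then_sync() and barrier_calls() are hypothetical names introduced for illustration, and barrier_stub() merely counts calls in place of the OpenCL barrier(CLK_LOCAL_MEM_FENCE) built-in. The diff does not show where _AMD_SOURCE_ is defined; the sketch assumes it arrives as a compiler define.

```c
#include <assert.h>

/* Hypothetical stand-in for the OpenCL barrier() built-in: it only counts
 * how often it is called, so the guard can be exercised as plain C. */
static int g_barrierCalls = 0;

static void barrier_stub(void)
{
    g_barrierCalls++;
}

/* Mirrors the shape of the guarded code in the diff: store a value into
 * scratch memory, then synchronize unless _AMD_SOURCE_ is defined. */
static void store_then_sync(float *scratch, int index, float value)
{
    assert(index >= 0);
    scratch[index] = value;
#if !defined(_AMD_SOURCE_)
    /* Compiled in for every target except the one defining _AMD_SOURCE_. */
    barrier_stub();
#endif
}

/* Reports how many times the stand-in barrier has run so far. */
static int barrier_calls(void)
{
    return g_barrierCalls;
}
```

Built without extra flags, each call to store_then_sync() increments the counter once; built with -D_AMD_SOURCE_ (an assumed way of supplying the macro), the counter stays unchanged because the call is removed by the preprocessor.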