Added to PME on GPU information in mdrun performance

author Kevin Boyd <kevin.boyd@uconn.edu>

Thu, 24 May 2018 02:44:28 +0000 (22:44 -0400)

committer Erik Lindahl <erik.lindahl@gmail.com>

Wed, 6 Jun 2018 10:53:32 +0000 (12:53 +0200)
diff --git a/docs/user-guide/mdrun-performance.rst b/docs/user-guide/mdrun-performance.rst
index 3f99da93e5f38c27d498572b90f9aad2fe6aa85a..a42d0a76ef97ad561907c013314a6fb5461baaad 100644
--- a/docs/user-guide/mdrun-performance.rst
+++ b/docs/user-guide/mdrun-performance.rst
@@ -243,12 +243,20 @@ behavior.
      are no separate PME ranks.
  
  ``-nb``
-    Used to set where to execute the non-bonded interactions.
+    Used to set where to execute the short-range non-bonded interactions.
      Can be set to "auto", "cpu", "gpu."
      Defaults to "auto," which uses a compatible GPU if available.
      Setting "cpu" requires that no GPU is used. Setting "gpu" requires
      that a compatible GPU be available and will be used.
  
+``-pme``
+    Used to set where to execute the long-range non-bonded interactions.
+    Can be set to "auto", "cpu", or "gpu".
+    Defaults to "auto", which uses a compatible GPU if available.
+    Setting "gpu" requires that a compatible GPU be available, and that GPU will be used.
+    Multiple PME ranks are not supported with PME on GPU, so if a GPU is used
+    for the PME calculation, ``-npme`` must be set to 1.
+
  ``-gpu_id``
      A string that specifies the ID numbers of the GPUs that
      are available to be used by ranks on this node. For example,
@@ -344,6 +352,38 @@ The number of ranks must be a multiple of the number of
  sockets, and the number of cores per node must be
  a multiple of the number of threads per rank.
  
+::
+
+    gmx mdrun -ntmpi 4 -nb gpu -pme cpu -gputasks 0011
+
+Starts :ref:`mdrun <gmx mdrun>` using four thread-MPI ranks, and maps them
+to GPUs with IDs 0 and 1. The CPU cores available will be split evenly between
+the ranks using OpenMP threads, with the first two ranks offloading short-range
+nonbonded force calculations to GPU 0, and the last two ranks offloading to GPU 1.
+The long-range component of the forces is calculated on CPUs. This may be optimal
+on hardware where the CPUs are relatively powerful compared to the GPUs.
+
+::
+
+    gmx mdrun -ntmpi 4 -nb gpu -pme gpu -npme 1 -gputasks 0001
+
+Starts :ref:`mdrun <gmx mdrun>` using four thread-MPI ranks, one of which is
+dedicated to the long-range PME calculation. The first 3 ranks offload their
+short-range non-bonded calculations to the GPU with ID 0, and the 4th (PME) rank
+offloads its calculations to the GPU with ID 1.
+
+::
+
+    gmx mdrun -ntmpi 4 -nb gpu -pme gpu -npme 1 -gputasks 0011
+
+Similar to the above example, with 3 ranks assigned to calculating short-range
+non-bonded forces, and one rank assigned to calculate the long-range forces.
+In this case, 2 of the 3 short-range ranks offload their nonbonded force
+calculations to GPU 0. The GPU with ID 1 calculates the short-range forces of
+the 3rd short-range rank, as well as the long-range forces of the PME-dedicated
+rank. Whether this or the above example is optimal will depend on the capabilities
+of the individual GPUs and the system composition.
+
  ::
  
      gmx mdrun -gpu_id 12
@@ -353,14 +393,6 @@ GPU 0 is dedicated to running a display). This requires
  two thread-MPI ranks, and will split the available
  CPU cores between them using OpenMP threads.
  
-::
-
-    gmx mdrun -ntmpi 4 -nb gpu -gputasks 1122
-
-Starts :ref:`mdrun <gmx mdrun>` using four thread-MPI ranks, and maps them
-to GPUs with IDs 1 and 2. The CPU cores available will
-be split evenly between the ranks using OpenMP threads.
-
  ::
  
      gmx mdrun -nt 6 -pin on -pinoffset 0 -pinstride 1
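
The three ``-gputasks`` examples added in this change can be read as a simple per-rank lookup. The following Python sketch is only an illustration of that reading, not how mdrun assigns tasks internally. It assumes one GPU task per rank, one digit per task, and that the dedicated PME rank comes last, as in the examples above. The function name ``map_gpu_tasks`` and its parameters are hypothetical and are not part of GROMACS.

```python
from typing import List, Tuple


def map_gpu_tasks(gputasks: str, ntmpi: int, npme: int = 0) -> List[Tuple[int, str, int]]:
    """Return (rank, role, gpu_id) for each rank.

    Assumptions (illustrative only): each rank owns exactly one GPU task,
    each character of `gputasks` is a single-digit GPU ID, and any
    dedicated PME ranks are the last `npme` ranks.
    """
    if not gputasks or any(c not in "0123456789" for c in gputasks):
        raise ValueError("gputasks must be a non-empty string of digits")
    if len(gputasks) != ntmpi:
        raise ValueError("this sketch expects one GPU task per rank")
    if not 0 <= npme <= ntmpi:
        raise ValueError("npme must be between 0 and ntmpi")

    n_short_range = ntmpi - npme  # ranks doing short-range non-bonded work
    mapping = []
    for rank, gpu_char in enumerate(gputasks):
        role = "short-range" if rank < n_short_range else "pme"
        mapping.append((rank, role, int(gpu_char)))
    return mapping


if __name__ == "__main__":
    # Mirrors: gmx mdrun -ntmpi 4 -nb gpu -pme gpu -npme 1 -gputasks 0011
    for rank, role, gpu in map_gpu_tasks("0011", ntmpi=4, npme=1):
        print(f"rank {rank}: {role} work on GPU {gpu}")
```

With ``-pme cpu`` (the first example) the same call with ``npme=0`` labels all four ranks as short-range, matching the description that only the short-range work is offloaded. Cases where a single rank runs both a short-range and a PME task on GPUs are outside what this sketch models.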