Non-bonded cut-off schemes
==========================

The default cut-off scheme in |Gromacs| |version| is based on classical
buffered Verlet lists. These are implemented extremely efficiently
on modern CPUs and accelerators, and support nearly all of the
algorithms used in |Gromacs|.

Before version 4.6, |Gromacs| always used pair-lists based on groups of
particles. These groups of particles were originally charge groups, which were
necessary with plain cut-off electrostatics. With the use of PME (or
reaction-field with a buffer), charge groups are no longer necessary
(and are ignored in the Verlet scheme). In |Gromacs| 4.6 and later, the
group-based cut-off scheme is still available, but is **deprecated in
5.0 and 5.1**. It is retained mainly for backwards
compatibility, to support the algorithms that have not yet been
converted, and for the few cases where it may allow faster simulations
with bio-molecular systems dominated by water.

Without PME, the group cut-off scheme should generally be combined
with a buffered pair-list to help avoid artifacts. However, the
group-scheme kernels that can implement this are much slower than
either the unbuffered group-scheme kernels, or the buffered
Verlet-scheme kernels. Use of the Verlet scheme is strongly encouraged
for all kinds of simulations, because it is easier and faster to run
correctly. In particular, GPU acceleration is available only with the
Verlet scheme.

The Verlet scheme uses properly buffered lists with exact cut-offs.
The size of the buffer is chosen with :mdp:`verlet-buffer-tolerance`
to permit a certain level of drift. Both the LJ and Coulomb potentials
are shifted to zero by subtracting the value at the cut-off. This
ensures that the energy is the integral of the force. Still, it is
advisable to have small forces at the cut-off, and hence to use PME or
reaction-field with infinite epsilon.
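
The potential shift can be written out explicitly. As a sketch,
with :math:`V(r)` standing for the unmodified LJ or Coulomb pair potential
and :math:`r_c` for the cut-off distance (both symbols are notation
introduced here for illustration, not taken from the |Gromacs| sources):

.. math::

   V_{\mathrm{shift}}(r) =
   \begin{cases}
   V(r) - V(r_c), & r < r_c \\
   0,             & r \geq r_c
   \end{cases}

Because the subtracted term :math:`V(r_c)` is a constant, the force
:math:`-\mathrm{d}V_{\mathrm{shift}}/\mathrm{d}r` for :math:`r < r_c` is the
same as for the unshifted potential, while the potential itself is
continuous at the cut-off. Any remaining jump in the force at
:math:`r_c` is what the advice about small forces at the cut-off is meant
to minimize.
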

Non-bonded scheme feature comparison
------------------------------------

All |Gromacs| |version| features not directly related to non-bonded
interactions are supported in both schemes. Eventually, all non-bonded
features will be supported in the Verlet scheme. A table describing
the compatibility of just non-bonded features with the two schemes is
given below.

Table: Support levels within the group and Verlet cut-off schemes
for features related to non-bonded interactions

====================================  ============ =======
Feature                               group        Verlet
====================================  ============ =======
unbuffered cut-off scheme             default      not by default
exact cut-off                         shift/switch always
potential-shift interactions          yes          yes
potential-switch interactions         yes          yes
force-switch interactions             yes          yes
switched potential                    yes          yes
switched forces                       yes          yes
non-periodic systems                  yes          Z + walls
implicit solvent                      yes          no
free energy perturbed non-bondeds     yes          yes
energy group contributions            yes          only on CPU
energy group exclusions               yes          no
AdResS multi-scale                    yes          no
OpenMP multi-threading                only PME     all
native GPU support                    no           yes
Coulomb PME                           yes          yes
Lennard-Jones PME                     yes          yes
virtual sites                         yes          yes
User-supplied tabulated interactions  yes          no
Buckingham VdW interactions           yes          no
rcoulomb != rvdw                      yes          no
twin-range                            yes          no
====================================  ============ =======

Performance
-----------

The performance of the group cut-off scheme depends very much on the
composition of the system and the use of buffering. There are
optimized kernels for interactions with water, so anything with a lot
of water runs very fast. But if you want properly buffered
interactions, you need to add a buffer that takes into account both
charge-group size and diffusion, and check each interaction against
the cut-off length each time step. This makes simulations much
slower. The performance of the Verlet scheme with the new non-bonded
kernels is independent of system composition, and the scheme is intended
to always run with a buffered pair-list. Typically, the buffer size is 0
to 10% of the cut-off, so you could gain a bit of performance by reducing
or removing the buffer, but this might not be a good trade-off against
simulation quality.
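
As a rough illustration of what such a buffer costs, assume a uniform
particle density, so that the number of pairs in the list scales with the
volume of the list sphere. With :math:`r_c` the cut-off, :math:`r_l` the
pair-list radius and :math:`f` the relative buffer size (all three symbols
are introduced here for illustration only):

.. math::

   r_l = (1 + f)\, r_c, \qquad
   \frac{N_{\mathrm{pairs}}(r_l)}{N_{\mathrm{pairs}}(r_c)} \approx (1 + f)^3

For a 10% buffer, :math:`f = 0.1`, this gives :math:`1.1^3 \approx 1.33`,
that is, roughly a third more pairs in the list than without a buffer.
This estimates list size only; it is not a measured performance figure.
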

The table below shows a performance comparison of most of the relevant
setups. Any atomistic model will have performance comparable to tips3p
(which has LJ on the hydrogens), unless a united-atom force field is
used. The performance of a protein in water will be between the tip3p
and tips3p performance. The group scheme is optimized for water
interactions, which means a single charge group containing one particle
with LJ, and 2 or 3 particles without LJ. Such kernels for water are
roughly twice as fast as those for a comparable system with LJ and/or
without charge groups. The implementation of the Verlet cut-off scheme has
no interaction-specific optimizations, except for only calculating half
of the LJ interactions if fewer than half of the particles have LJ. For
molecules solvated in water, the scaling of the Verlet scheme to higher
numbers of cores is better than that of the group scheme, because the
load is more balanced. On the most recent Intel CPUs, the absolute
performance of the Verlet scheme exceeds that of the group scheme,
even for water-only systems.

Table: Performance in ns/day of various water systems under different
non-bonded setups in |Gromacs| using either 8 thread-MPI ranks (group
scheme), or 8 OpenMP threads (Verlet scheme). 3000 particles, 1.0 nm
cut-off, PME with 0.11 nm grid, dt=2 fs, Intel Core i7 2600 (AVX), 3.4
GHz + Nvidia GTX660Ti

========================  =================  ===============  ================  =====================
system                    group, unbuffered  group, buffered  Verlet, buffered  Verlet, buffered, GPU
========================  =================  ===============  ================  =====================
tip3p, charge groups      208                116              170               450
tips3p, charge groups     129                63               162               450
tips3p, no charge groups  104                75               162               450
========================  =================  ===============  ================  =====================

How to use the Verlet scheme
----------------------------

The Verlet scheme is enabled by default with option :mdp:`cutoff-scheme`.
The value of the ``.mdp`` option :mdp:`verlet-buffer-tolerance` sets a
pair-list buffer whose size is tuned for the given energy drift (in
kJ/mol/ns per particle). The effective drift is usually much lower, as
:ref:`gmx grompp` assumes constant particle velocities. (Note that in single
precision for normal atomistic simulations constraints cause a drift
somewhere around 0.0001 kJ/mol/ns per particle, so it doesn't make sense
to go much lower.) Details on how the buffer size is chosen can be
found in the reference below and in the Reference Manual.
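
A minimal ``.mdp`` fragment for this kind of setup might look as follows.
The option names are the ones used in this section plus ``coulombtype``;
every value is an illustrative assumption, not a recommendation.

.. code-block:: ini

   ; Illustrative values only; adjust for your system and force field
   cutoff-scheme           = Verlet   ; already the default
   coulombtype             = PME      ; keeps forces small at the cut-off
   rcoulomb                = 1.0      ; nm
   rvdw                    = 1.0      ; nm, kept equal to rcoulomb
   verlet-buffer-tolerance = 0.005    ; assumed drift target; grompp sizes the buffer from it

With settings like these, the pair-list buffer is not given explicitly;
it is derived from the drift target when the run input is prepared.
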

For constant-energy (NVE) simulations, the buffer size will be
inferred from the temperature that corresponds to the velocities
(either those generated, if applicable, or those found in the input
configuration). Alternatively, :mdp:`verlet-buffer-tolerance` can be set
to -1 and a buffer set manually by specifying :mdp:`rlist` greater than
the larger of :mdp:`rcoulomb` and :mdp:`rvdw`. The simplest way to get a
reasonable buffer size is to use an NVT ``.mdp`` file with the target
temperature set to what you expect in your NVE simulation, and
transfer the buffer size printed by :ref:`gmx grompp` to your NVE
``.mdp`` file.
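
The manual alternative can be sketched as the following ``.mdp`` fragment.
The ``rlist`` value is a hypothetical placeholder for whatever buffer size
you have determined for your own system.

.. code-block:: ini

   ; Manual pair-list buffer for an NVE run (illustrative values only)
   cutoff-scheme           = Verlet
   verlet-buffer-tolerance = -1       ; switch off automatic buffer sizing
   rcoulomb                = 1.0      ; nm
   rvdw                    = 1.0      ; nm
   rlist                   = 1.1      ; nm, hypothetical; must exceed max(rcoulomb, rvdw)
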

When a GPU is used, :mdp:`nstlist` is automatically increased by
:ref:`gmx mdrun`, usually to 20 or more; :mdp:`rlist` is increased along
with it to stay below the target energy drift. Further information on
running mdrun with GPUs is available.

Further information
-------------------

For further information on algorithmic and implementation details of
the Verlet cut-off scheme and the MxN kernels, as well as detailed
performance analysis, please consult the following article:

Páll, S. and Hess, B. A flexible algorithm for calculating pair
interactions on SIMD architectures. Comput. Phys. Commun. 184,
2641–2650 (2013). http://dx.doi.org/10.1016/j.cpc.2013.06.003