<li><a HREF="#xmdrun"><b>shell molecular dynamics</b></a>(emtol,niter,fcstep)
<li><a HREF="#tpi"><b>test particle insertion</b></a>(rtpi)
<li><A HREF="#out"><b>output control</b></A> (nstxout, nstvout, nstfout, nstlog, nstcalcenergy, nstenergy, nstxtcout, xtc-precision, xtc-grps, energygrps)
-<li><A HREF="#nl"><b>neighbor searching</b></A> (nstlist, ns-type, pbc, periodic-molecules, rlist, rlistlong)
+<li><A HREF="#nl"><b>neighbor searching</b></A> (cutoff-scheme, nstlist, ns-type, pbc, periodic-molecules, verlet-buffer-drift, rlist, rlistlong)
<li><A HREF="#el"><b>electrostatics</b></A> (coulombtype, rcoulomb-switch, rcoulomb, epsilon-r, epsilon-rf)
<li><A HREF="#vdw"><b>VdW</b></A> (vdwtype, rvdw-switch, rvdw, DispCorr)
<li><A HREF="#table"><b>tables</b></A> (table-extension, energygrp-table)
<dt><b>None</b></dt>
<dd>No restriction on the center of mass motion
</dl></dd>
-<dt><b>nstcomm: (10) [steps]</b></dt>
+<dt><b>nstcomm: (100) [steps]</b></dt>
<dd>frequency for center of mass motion removal</dd>
<dt><b>comm-grps:</b></dt>
<dd>group(s) for center of mass motion removal, default is the whole system</dd>
<dt><b>nstlog: (1000) [steps]</b></dt>
<dd>frequency to write energies to <!--Idx-->log file<!--EIdx-->,
the last energies are always written</dd>
-<dt><b>nstcalcenergy: (-1)</b></dt>
+<dt><b>nstcalcenergy: (100)</b></dt>
<dd>frequency for calculating the energies, 0 is never.
This option is only relevant with dynamics.
With a twin-range cut-off setup <b>nstcalcenergy</b> should be equal to
This option affects the performance in parallel simulations,
because calculating energies requires global communication between all
processes which can become a bottleneck at high parallelization.
-The default value of -1 sets <b>nstcalcenergy</b> equal to <b>nstlist</b>,
-unless <b>nstlist</b> ≤0, then a value of 10 is used.
-If <b>nstenergy</b> is smaller than the automatically generated value,
-the lowest common denominator of <b>nstenergy</b> and <b>nstlist</b> is used.
</dd>
-<dt><b>nstenergy: (100) [steps]</b></dt>
+<dt><b>nstenergy: (1000) [steps]</b></dt>
<dd>frequency to write energies to energy file,
the last energies are always written,
should be a multiple of <b>nstcalcenergy</b>.
<hr>
<h3>Neighbor searching<!--QuietIdx-->neighbor searching<!--EQuietIdx--></h3>
<dl>
+<dt><b>cutoff-scheme:</b></dt>
+<dd><dl compact>
+<dt><b>group</b></dt>
+<dd>Generate a pair list for groups of atoms. These groups correspond to the
+charge groups in the topology. This was the only cut-off treatment scheme
+before version 4.6.
+There is no explicit buffering of the pair list. This enables efficient force
+calculations, but energy is only conserved when a buffer is explicitly added.
+For energy conservation, the <b>Verlet</b> option provides a more convenient
+and efficient algorithm.</dd>
+
+<dt><b>Verlet</b></dt>
+<dd>Generate a pair list with buffering. The buffer size is automatically set
+based on <b>verlet-buffer-drift</b>, unless this is set to -1, in which case
+<b>rlist</b> will be used. This option has an explicit, exact cut-off at
+<b>rvdw</b>=<b>rcoulomb</b>. Currently only cut-off, reaction-field,
+PME electrostatics and plain LJ are supported. Some <tt>mdrun</tt> functionality
+is not yet supported with the <b>Verlet</b> scheme, but <tt>grompp</tt> checks for this.
+Native GPU acceleration is only supported with <b>Verlet</b>. With GPU-accelerated PME,
+<tt>mdrun</tt> will automatically tune the CPU/GPU load balance by
+scaling <b>rcoulomb</b> and the grid spacing. This can be turned off with
+<tt>-notunepme</tt>.
+
+<b>Verlet</b> is somewhat faster than <b>group</b> when there is no water, or if <b>group</b> would use a pair-list buffer to conserve energy.
+</dd>
+</dl></dd>
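+<dd>As a minimal, illustrative sketch (the numeric values are assumptions, not
+recommendations), an <tt>mdp</tt> fragment that selects the <b>Verlet</b> scheme
+with PME electrostatics and plain LJ could look like this:
+<pre>
+; illustrative values only, adjust for your own system
+cutoff-scheme       = Verlet
+coulombtype         = PME
+vdwtype             = cut-off
+rcoulomb            = 1.0    ; the Verlet scheme uses an exact cut-off at rvdw = rcoulomb
+rvdw                = 1.0
+verlet-buffer-drift = 0.005  ; default value; the pair-list buffer is set automatically
+nstlist             = 20     ; 20 or more is advised for GPU non-bonded calculation (see nstlist)
+</pre>
+</dd>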
+
<dt><b>nstlist: (10) [steps]</b></dt>
<dd><dl compact>
<dt><b>>0</b></dt>
the long-range forces, when using twin-range cut-offs). When this is 0,
the neighbor list is made only once.
With energy minimization the neighborlist will be updated for every
-energy evaluation when <b>nstlist</b><tt>>0</tt>.</dd>
+energy evaluation when <b>nstlist</b><tt>>0</tt>.
+With non-bonded force calculation on the GPU, a value of 20 or more gives
+the best performance.</dd>
<dt><b>0</b></dt>
<dd>The neighbor list is only constructed once and never updated.
This is mainly useful for vacuum simulations in which all particles
see each other.</dd>
<dt><b>-1</b></dt>
-<dd>Automated update frequency.
+<dd>Automated update frequency, only supported with <b>cutoff-scheme</b>=<b>group</b>.
This can only be used with switched, shifted or user potentials where
the cut-off can be smaller than <b>rlist</b>. One then has a buffer
of size <b>rlist</b> minus the longest cut-off.
and molecules are not made whole in the output</dd>
</dl></dd>
-<dt><b>rlist: (-1) [nm]</b></dt>
-<dd>cut-off distance for the short-range neighbor list, should be ≥ 0</dd>
+<dt><b>verlet-buffer-drift: (0.005) [kJ/mol/ps]</b></dt>
+<dd>Useful only with <b>cutoff-scheme</b>=<b>Verlet</b>. This sets the target energy drift
+per particle caused by the Verlet buffer, which indirectly sets <b>rlist</b>.
+As both <b>nstlist</b> and the Verlet buffer size are fixed
+(for performance reasons), particle pairs not in the pair list can occasionally
+get within the cut-off distance during <b>nstlist</b>-1 steps. This
+generates energy drift. In a constant-temperature ensemble, the drift can be
+estimated for a given cut-off and <b>rlist</b>. The estimate assumes a
+homogeneous particle distribution, hence the drift might be slightly
+underestimated for multi-phase systems. For longer pair-list lifetimes,
+(<b>nstlist</b>-1)*dt, the drift is overestimated, because the interactions
+between particles are ignored. Combined with cancellation of errors,
+the actual energy drift is usually one to two orders of magnitude smaller.
+Note that the generated buffer size takes into account that
+the GROMACS pair-list setup leads to a reduction in the drift by
+a factor of 10, compared to a simple particle-pair based list.
+Without dynamics (energy minimization etc.), the buffer is 5% of the cut-off.
+For dynamics without temperature coupling or to override the buffer size,
+use <b>verlet-buffer-drift</b>=-1 and set <b>rlist</b> manually.</dd>
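+<dd>A hedged sketch of the manual override described above: the automatic
+buffer is switched off and <b>rlist</b> is set by hand. The cut-off and
+<b>rlist</b> values are hypothetical and only illustrate that <b>rlist</b>
+is chosen larger than the cut-off to leave a buffer:
+<pre>
+; illustrative values only
+cutoff-scheme       = Verlet
+verlet-buffer-drift = -1     ; disable the automatic buffer, use rlist instead
+rcoulomb            = 1.0
+rvdw                = 1.0
+rlist               = 1.1    ; assumed value: 0.1 nm buffer beyond the 1.0 nm cut-off
+</pre>
+</dd>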
+
+<dt><b>rlist: (1) [nm]</b></dt>
+<dd>Cut-off distance for the short-range neighbor list, should be ≥ 0.
+With <b>cutoff-scheme</b>=<b>Verlet</b>, this is by default set by the
+<b>verlet-buffer-drift</b> option and the value of <b>rlist</b> is ignored.</dd>
<dt><b>rlistlong: (-1) [nm]</b></dt>
<dd>Cut-off distance for the long-range neighbor list.
<b><A HREF="#tc">ref-t</A></b> [K].</dd>
<dt><b>Reaction-Field-zero</b></dt>
-<dd>In GROMACS normal reaction-field electrostatics leads to bad
+<dd>In GROMACS, normal reaction-field electrostatics with
+<b>cutoff-scheme</b>=<b>group</b> leads to bad
energy conservation. <b>Reaction-Field-zero</b> solves this
by making the potential zero beyond the cut-off. It can only
be used with an infinite dielectric constant (<b>epsilon-rf=0</b>),
<dt><b>PME-Switch</b></dt>
<dd>A combination of PME and a switch function for the direct-space part
(see above). <b>rcoulomb</b> is allowed to be smaller than <b>rlist</b>.
-This is mainly useful constant energy simulations. For constant temperature
-simulations the advantage of improved energy conservation
-is usually outweighed by the small loss in accuracy of the electrostatics.
+This is mainly useful for constant energy simulations (note that using
+<b>PME</b> with <b>cutoff-scheme</b>=<b>Verlet</b> will be more efficient).
</dd>
<dt><b>PME-User</b></dt>
<A HREF="#free">couple-lambda0</A><br>
<A HREF="#free">couple-lambda1</A><br>
<A HREF="#free">couple-moltype</A><br>
+<A HREF="#nl">cutoff-scheme</A><br>
<A HREF="#pp">define</A><br>
<A HREF="#neq">deform</A><br>
<A HREF="#free">delta-lambda</A><br>
<A HREF="#user">userreal3</A><br>
<A HREF="#user">userreal4</A><br>
<A HREF="#el">vdwtype</A><br>
+<A HREF="#nl">verlet-buffer-drift</A><br>
<A HREF="#out">xtc-grps</A><br>
<A HREF="#out">xtc-precision</A><br>
<A HREF="#sa">zero-temp-time</A><br>