docs/reference-manual/algorithms/energy-minimization.rst

   1 .. _em:
   2
   3 Energy Minimization
   4 -------------------
   5
   6 Energy minimization in |Gromacs| can be done using steepest descent,
   7 conjugate gradients, or l-bfgs (limited-memory
   8 Broyden-Fletcher-Goldfarb-Shanno quasi-Newtonian minimizer...we prefer
   9 the abbreviation). EM is just an option of the :ref:`mdrun <gmx mdrun>` program.
  10
  11 Steepest Descent
  12 ~~~~~~~~~~~~~~~~
  13
  14 Although steepest descent is certainly not the most efficient algorithm
  15 for searching, it is robust and easy to implement.
  16
  17 We define the vector :math:`\mathbf{r}` as the vector of
  18 all :math:`3N` coordinates. Initially a maximum displacement :math:`h_0`
  19 (*e.g.* 0.01 nm) must be given.
  20
  21 First the forces :math:`\mathbf{F}` and potential energy
  22 are calculated. New positions are calculated by
  23
  24   .. math:: \mathbf{r}_{n+1} =  \mathbf{r}_n + \frac{\mathbf{F}_n}{\max (|\mathbf{F}_n|)} h_n,
  25             :label: eqnEMpos
  26
  27 where :math:`h_n` is the maximum displacement and
  28 :math:`\mathbf{F}_n` is the force, or the negative
  29 gradient of the potential :math:`V`. The notation :math:`\max
  30 (|\mathbf{F}_n|)` means the largest scalar force on any
  31 atom. The forces and energy are again computed for the new positions
  32
  33 | If (:math:`V_{n+1} < V_n`) the new positions are accepted and
  34   :math:`h_{n+1} = 1.2
  35   h_n`.
  36 | If (:math:`V_{n+1} \geq V_n`) the new positions are rejected and
  37   :math:`h_n = 0.2 h_n`.
  38
  39 The algorithm stops when either a user-specified number of force
  40 evaluations has been performed (*e.g.* 100), or when the maximum of the
  41 absolute values of the force (gradient) components is smaller than a
  42 specified value :math:`\epsilon`. Since force truncation produces some
  43 noise in the energy evaluation, the stopping criterion should not be
  44 made too tight to avoid endless iterations. A reasonable value for
  45 :math:`\epsilon` can be estimated from the root mean square force
  46 :math:`f` a harmonic oscillator would exhibit at a temperature
  47 :math:`T`. This value is
  48
  49 .. math:: f = 2 \pi \nu \sqrt{ 2mkT},
  50           :label: eqnEMharmosc
  51
  52 where :math:`\nu` is the oscillator frequency, :math:`m` the (reduced)
  53 mass, and :math:`k` Boltzmann’s constant. For a weak oscillator with a
  54 wave number of 100 cm\ :math:`^{-1}` and a mass of 10 atomic units, at a
  55 temperature of 1 K, :math:`f=7.7` kJ mol\ :math:`^{-1}` nm:math:`^{-1}`.
  56 A value for :math:`\epsilon` between 1 and 10 is acceptable.
  57
  58 Conjugate Gradient
  59 ~~~~~~~~~~~~~~~~~~
  60
  61 Conjugate gradient is slower than steepest descent in the early stages
  62 of the minimization, but becomes more efficient closer to the energy
  63 minimum. The parameters and stop criterion are the same as for steepest
  64 descent. In |Gromacs| conjugate gradient can not be used with constraints,
  65 including the SETTLE algorithm for water \ :ref:`47 <refMiyamoto92>`, as
  66 this has not been implemented. If water is present it must be of a
  67 flexible model, which can be specified in the :ref:`mdp` file
  68 by ``define = -DFLEXIBLE``.
  69
  70 This is not really a restriction, since the accuracy of conjugate
  71 gradient is only required for minimization prior to a normal-mode
  72 analysis, which cannot be performed with constraints. For most other
  73 purposes steepest descent is efficient enough.
  74
  75 L-BFGS
  76 ~~~~~~
  77
  78 The original BFGS algorithm works by successively creating better
  79 approximations of the inverse Hessian matrix, and moving the system to
  80 the currently estimated minimum. The memory requirements for this are
  81 proportional to the square of the number of particles, so it is not
  82 practical for large systems like biomolecules. Instead, we use the
  83 L-BFGS algorithm of Nocedal  \ :ref:`52 <refByrd95a>`,
  84 :ref:`53 <refZhu97a>`, which approximates the inverse Hessian by a fixed number
  85 of corrections from previous steps. This sliding-window technique is
  86 almost as efficient as the original method, but the memory requirements
  87 are much lower - proportional to the number of particles multiplied with
  88 the correction steps. In practice we have found it to converge faster
  89 than conjugate gradients, but due to the correction steps it is not yet
  90 parallelized. It is also noteworthy that switched or shifted
  91 interactions usually improve the convergence, since sharp cut-offs mean
  92 the potential function at the current coordinates is slightly different
  93 from the previous steps used to build the inverse Hessian approximation.