docs/reference-manual/energy-minimization.rst

   1 .. _em:
   2
   3 Energy Minimization
   4 -------------------
   5
   6 Energy minimization in |Gromacs| can be done using steepest descent,
   7 conjugate gradients, or l-bfgs (limited-memory
   8 Broyden-Fletcher-Goldfarb-Shanno quasi-Newtonian minimizer...we prefer
   9 the abbreviation). EM is just an option of the :ref:`mdrun <gmx mdrun>` program.
  10
  11 Steepest Descent
  12 ~~~~~~~~~~~~~~~~
  13
  14 Although steepest descent is certainly not the most efficient algorithm
  15 for searching, it is robust and easy to implement.
  16
  17 We define the vector :math:`\mathbf{r}` as the vector of
  18 all :math:`3N` coordinates. Initially a maximum displacement :math:`h_0`
  19 (*e.g.* 0.01 nm) must be given.
  20
  21 First the forces :math:`\mathbf{F}` and potential energy
  22 are calculated. New positions are calculated by
  23
  24   .. math:: \mathbf{r}_{n+1} =  \mathbf{r}_n + \frac{\mathbf{F}_n}{\max (|\mathbf{F}_n|)} h_n,
  25
  26 where :math:`h_n` is the maximum displacement and
  27 :math:`\mathbf{F}_n` is the force, or the negative
  28 gradient of the potential :math:`V`. The notation :math:`\max
  29 (|\mathbf{F}_n|)` means the largest scalar force on any
  30 atom. The forces and energy are again computed for the new positions
  31
  32 | If (:math:`V_{n+1} < V_n`) the new positions are accepted and
  33   :math:`h_{n+1} = 1.2
  34   h_n`.
  35 | If (:math:`V_{n+1} \geq V_n`) the new positions are rejected and
  36   :math:`h_n = 0.2 h_n`.
  37
  38 The algorithm stops when either a user-specified number of force
  39 evaluations has been performed (*e.g.* 100), or when the maximum of the
  40 absolute values of the force (gradient) components is smaller than a
  41 specified value :math:`\epsilon`. Since force truncation produces some
  42 noise in the energy evaluation, the stopping criterion should not be
  43 made too tight to avoid endless iterations. A reasonable value for
  44 :math:`\epsilon` can be estimated from the root mean square force
  45 :math:`f` a harmonic oscillator would exhibit at a temperature
  46 :math:`T`. This value is
  47
  48 .. math:: f = 2 \pi \nu \sqrt{ 2mkT},
  49
  50 where :math:`\nu` is the oscillator frequency, :math:`m` the (reduced)
  51 mass, and :math:`k` Boltzmann’s constant. For a weak oscillator with a
  52 wave number of 100 cm\ :math:`^{-1}` and a mass of 10 atomic units, at a
  53 temperature of 1 K, :math:`f=7.7` kJ mol\ :math:`^{-1}` nm:math:`^{-1}`.
  54 A value for :math:`\epsilon` between 1 and 10 is acceptable.
  55
  56 Conjugate Gradient
  57 ~~~~~~~~~~~~~~~~~~
  58
  59 Conjugate gradient is slower than steepest descent in the early stages
  60 of the minimization, but becomes more efficient closer to the energy
  61 minimum. The parameters and stop criterion are the same as for steepest
  62 descent. In |Gromacs| conjugate gradient can not be used with constraints,
  63 including the SETTLE algorithm for water \ :ref:`47 <refMiyamoto92>`, as
  64 this has not been implemented. If water is present it must be of a
  65 flexible model, which can be specified in the :ref:`mdp` file
  66 by ``define = -DFLEXIBLE``.
  67
  68 This is not really a restriction, since the accuracy of conjugate
  69 gradient is only required for minimization prior to a normal-mode
  70 analysis, which cannot be performed with constraints. For most other
  71 purposes steepest descent is efficient enough.
  72
  73 L-BFGS
  74 ~~~~~~
  75
  76 The original BFGS algorithm works by successively creating better
  77 approximations of the inverse Hessian matrix, and moving the system to
  78 the currently estimated minimum. The memory requirements for this are
  79 proportional to the square of the number of particles, so it is not
  80 practical for large systems like biomolecules. Instead, we use the
  81 L-BFGS algorithm of Nocedal  \ :ref:`52 <refByrd95a>`,
  82 :ref:`53 <refZhu97a>`, which approximates the inverse Hessian by a fixed number
  83 of corrections from previous steps. This sliding-window technique is
  84 almost as efficient as the original method, but the memory requirements
  85 are much lower - proportional to the number of particles multiplied with
  86 the correction steps. In practice we have found it to converge faster
  87 than conjugate gradients, but due to the correction steps it is not yet
  88 parallelized. It is also noteworthy that switched or shifted
  89 interactions usually improve the convergence, since sharp cut-offs mean
  90 the potential function at the current coordinates is slightly different
  91 from the previous steps used to build the inverse Hessian approximation.