Averages and fluctuations
=========================

Formulae for averaging
----------------------

**Note:** this section was taken from ref \ :ref:`179 <refGunsteren94a>`.

When analyzing an MD trajectory, averages :math:`\left<x\right>` and
fluctuations

.. math::  \left<(\Delta x)^2\right>^{{\frac{1}{2}}} ~=~ \left<[x-\left<x\right>]^2\right>^{{\frac{1}{2}}}
           :label: eqnvar0

of a quantity :math:`x` are to be computed. The variance
:math:`\sigma_x` of a series of :math:`N_x` values, :math:`\{x_i\}`, is
defined here without the :math:`1/N_x` normalization and can be computed
from

.. math:: \sigma_x~=~ \sum_{i=1}^{N_x} x_i^2 ~-~  \frac{1}{N_x}\left(\sum_{i=1}^{N_x}x_i\right)^2
          :label: eqnvar1

Unfortunately, this formula is numerically not very accurate, especially
when :math:`\sigma_x^{{\frac{1}{2}}}` is small compared to the values of
:math:`x_i`. The following (equivalent) expression is numerically more
accurate:

.. math:: \sigma_x ~=~ \sum_{i=1}^{N_x} [x_i  - \left<x\right>]^2

with

.. math:: \left<x\right> ~=~ \frac{1}{N_x} \sum_{i=1}^{N_x} x_i
          :label: eqnvar2

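The difference in accuracy can be illustrated with a short Python sketch.
The function names ``sigma_one_pass`` and ``sigma_two_pass`` and the sample
data are assumptions made for this illustration; they are not taken from the
|Gromacs| source. The data have a large offset and a small spread, which is
the case where the single-scan formula loses precision.

.. code-block:: python

   def sigma_one_pass(values):
       """Sum of squares minus the squared sum over N (single scan)."""
       n = len(values)
       sum_x = sum(values)
       sum_x2 = sum(x * x for x in values)
       return sum_x2 - sum_x * sum_x / n


   def sigma_two_pass(values):
       """Sum of squared deviations from the mean (two scans)."""
       mean = sum(values) / len(values)
       return sum((x - mean) ** 2 for x in values)


   # Hypothetical data: offsets 4, 7, 13, 16 on top of 1e9.
   # For the offsets alone the sum of squared deviations is 90.
   data = [1.0e9 + d for d in (4.0, 7.0, 13.0, 16.0)]

   print(sigma_two_pass(data))  # deviations are small, so little is lost
   print(sigma_one_pass(data))  # difference of two numbers of order 4e18

In double precision the two sums in the single-scan formula are of order
:math:`10^{18}`, so their difference cannot resolve a value as small as 90.
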
Using the expression in terms of squared deviations from
:math:`\left<x\right>` together with
:eq:`eqn. %s <eqnvar2>` one has to go through the series of
:math:`x_i` values twice, once to determine :math:`\left<x\right>` and
again to compute :math:`\sigma_x`, whereas
:eq:`eqn. %s <eqnvar1>` requires only one sequential scan of
the series :math:`\{x_i\}`. However, one may cast the two-pass
expression in another form, containing partial
sums, which allows for a sequential update algorithm. Define the partial
sum

.. math:: X_{n,m} ~=~ \sum_{i=n}^{m} x_i

and the partial variance

.. math::  \sigma_{n,m} ~=~ \sum_{i=n}^{m}  \left[x_i - \frac{X_{n,m}}{m-n+1}\right]^2
           :label: eqnsigma

It can be shown that

.. math::  X_{n,m+k} ~=~  X_{n,m} + X_{m+1,m+k}
           :label: eqnXpartial

and

.. math:: \begin{aligned}
          \sigma_{n,m+k} &= \sigma_{n,m} + \sigma_{m+1,m+k} \\
          &\quad + \left[~\frac {X_{n,m}}{m-n+1} - \frac{X_{n,m+k}}{m+k-n+1}~\right]^2
          ~\frac{(m-n+1)(m+k-n+1)}{k}
          \end{aligned}
          :label: eqnvarpartial

For :math:`n=1` one finds

.. math:: \sigma_{1,m+k} ~=~ \sigma_{1,m} + \sigma_{m+1,m+k}~+~
          \left[~\frac{X_{1,m}}{m} - \frac{X_{1,m+k}}{m+k}~\right]^2~ \frac{m(m+k)}{k}
          :label: eqnsig1

and for :math:`n=1` and :math:`k=1`, where the single-point term
:math:`\sigma_{m+1,m+1}` vanishes,
(:eq:`eqn. %s <eqnvarpartial>`) becomes

.. math:: \begin{aligned}
          \sigma_{1,m+1}  &= \sigma_{1,m} +
          \left[\frac{X_{1,m}}{m} - \frac{X_{1,m+1}}{m+1}\right]^2 m(m+1)\\
          &= \sigma_{1,m} +
          \frac {[~X_{1,m} - m x_{m+1}~]^2}{m(m+1)}
          \end{aligned}
          :label: eqnsimplevar0

where we have used the relation

.. math:: X_{1,m+1} ~=~  X_{1,m} + x_{m+1}
          :label: eqnsimplevar1

Using formulae (:eq:`eqn. %s <eqnsimplevar0>`) and
(:eq:`eqn. %s <eqnsimplevar1>`) the average

.. math:: \left<x\right> ~=~ \frac{X_{1,N_x}}{N_x}

and the fluctuation

.. math:: \left<(\Delta x)^2\right>^{{\frac{1}{2}}} = \left[\frac {\sigma_{1,N_x}}{N_x}\right]^{{\frac{1}{2}}}

can be obtained by one sweep through the data.

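The one-sweep procedure can be sketched in Python as follows. This is a
minimal illustration of the update relations for :math:`X_{1,m}` and
:math:`\sigma_{1,m}`, not the |Gromacs| implementation; the function name
``running_sum_and_sigma`` and the sample values are assumptions.

.. code-block:: python

   import math


   def running_sum_and_sigma(values):
       """Return (X_1N, sigma_1N) accumulated in a single pass."""
       total = 0.0  # X_{1,m}
       sigma = 0.0  # sigma_{1,m}
       for m, x in enumerate(values):  # m = number of points already included
           if m > 0:
               # sigma_{1,m+1} = sigma_{1,m} + [X_{1,m} - m x_{m+1}]^2 / (m (m+1))
               sigma += (total - m * x) ** 2 / (m * (m + 1))
           # X_{1,m+1} = X_{1,m} + x_{m+1}
           total += x
       return total, sigma


   data = [4.0, 7.0, 13.0, 16.0]  # hypothetical sample values
   X, sigma = running_sum_and_sigma(data)
   average = X / len(data)
   fluctuation = math.sqrt(sigma / len(data))
   print(average, sigma, fluctuation)

The partial variance is updated before the partial sum, because the update
of :math:`\sigma_{1,m+1}` uses the old sum :math:`X_{1,m}`.
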
Implementation
--------------

In |Gromacs| the instantaneous energies :math:`E(m)` are stored in the
:ref:`energy file <edr>`, along with the values of :math:`\sigma_{1,m}` and
:math:`X_{1,m}`. Although simulation steps are counted from 0, for the
energies and fluctuations steps are counted from 1. This means that the
equations presented here are the ones that are implemented. We give
somewhat lengthy derivations in this section to simplify checking of code
and equations later on.

Part of a Simulation
~~~~~~~~~~~~~~~~~~~~

It is not uncommon to perform a simulation where the first part, *e.g.*
100 ps, is taken as equilibration. However, the averages and
fluctuations as printed in the :ref:`log file <log>` are computed over the whole
simulation. The equilibration time, which is now part of the simulation,
may in such a case invalidate the averages and fluctuations, because
these numbers are now dominated by the initial drift towards
equilibrium.

Using :eq:`eqns. %s <eqnXpartial>` and
:eq:`%s <eqnvarpartial>` the partial sum and partial variance over part
of the trajectory, and hence the average and standard deviation, can be
computed as:

.. math::

   \begin{aligned}
   X_{m+1,m+k}     &= X_{1,m+k} - X_{1,m}                 \\
   \sigma_{m+1,m+k} &= \sigma_{1,m+k}-\sigma_{1,m} - \left[~\frac{X_{1,m}}{m} - \frac{X_{1,m+k}}{m+k}~\right]^{2}~ \frac{m(m+k)}{k}
   \end{aligned}

or, more generally (with :math:`p > 1` and :math:`q \geq p`; for
:math:`p = 1` the stored :math:`X_{1,q}` and :math:`\sigma_{1,q}` apply
directly):

.. math::

   \begin{aligned}
   X_{p,q}         &=     X_{1,q} - X_{1,p-1}     \\
   \sigma_{p,q}    &=     \sigma_{1,q}-\sigma_{1,p-1} - \left[~\frac{X_{1,p-1}}{p-1} - \frac{X_{1,q}}{q}~\right]^{2}~ \frac{(p-1)q}{q-p+1}
   \end{aligned}

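As an illustration, the general expressions for :math:`X_{p,q}` and
:math:`\sigma_{p,q}` can be written as a small Python function. The name
``window_sum_and_sigma``, its argument order and the numbers used are
assumptions for this sketch, which requires :math:`p > 1`.

.. code-block:: python

   def window_sum_and_sigma(p, q, X_1_pm1, sigma_1_pm1, X_1_q, sigma_1_q):
       """Return (X_pq, sigma_pq) for 1 < p <= q from cumulative quantities."""
       k = q - p + 1  # number of points in the window p..q
       mean_shift = X_1_pm1 / (p - 1) - X_1_q / q
       X_pq = X_1_q - X_1_pm1
       sigma_pq = sigma_1_q - sigma_1_pm1 - mean_shift ** 2 * (p - 1) * q / k
       return X_pq, sigma_pq


   # Hypothetical cumulative values for the series 4, 7, 13, 16:
   # after 2 points X = 11 and sigma = 4.5; after 4 points X = 40 and sigma = 90.
   X_34, sigma_34 = window_sum_and_sigma(3, 4, 11.0, 4.5, 40.0, 90.0)
   print(X_34 / 2, sigma_34)  # mean and partial variance of points 3..4

For this series the window contains 13 and 16, so the direct result is a
mean of 14.5 and a partial variance of 4.5, which the function reproduces.
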
**Note** that implementation of this is not entirely trivial, since
energies are not stored every time step of the simulation. We therefore
have to construct :math:`X_{1,p-1}` and :math:`\sigma_{1,p-1}` from the
information at time :math:`p` using
:eq:`eqns. %s <eqnsimplevar0>` and
:eq:`%s <eqnsimplevar1>`:

.. math::

   \begin{aligned}
   X_{1,p-1}       &=     X_{1,p} - x_p   \\
   \sigma_{1,p-1}  &=     \sigma_{1,p} -  \frac {[~X_{1,p-1} - (p-1) x_{p}~]^2}{(p-1)p}
   \end{aligned}

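A minimal Python sketch of this backward step is given below. The function
name ``step_back`` and the example frame are hypothetical; the sketch assumes
that the frame stored at step :math:`p > 1` provides :math:`X_{1,p}`,
:math:`\sigma_{1,p}` and the instantaneous value :math:`x_p`.

.. code-block:: python

   def step_back(p, X_1_p, sigma_1_p, x_p):
       """Return (X_{1,p-1}, sigma_{1,p-1}) from the values stored at step p."""
       X_prev = X_1_p - x_p
       # X_prev must be known first, because it enters the variance correction.
       sigma_prev = sigma_1_p - (X_prev - (p - 1) * x_p) ** 2 / ((p - 1) * p)
       return X_prev, sigma_prev


   # Hypothetical stored frame after the series 4, 7, 13, 16 (so x_4 = 16).
   X_3, sigma_3 = step_back(4, 40.0, 90.0, 16.0)
   print(X_3, sigma_3)

For the first three points 4, 7 and 13 the sum is 24 and the sum of squared
deviations from their mean of 8 is 42, in agreement with the function.
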
Combining two simulations
~~~~~~~~~~~~~~~~~~~~~~~~~

Another frequently occurring problem is that the fluctuations of two
simulations must be combined. Consider the following example: we have
two simulations (A) of :math:`n` and (B) of :math:`m` steps, in which
the second simulation is a continuation of the first. However, the
second simulation starts numbering from 1 instead of from :math:`n+1`.
For the partial sum this is no problem: we only have to add :math:`X_{1,n}^A`
from run A:

.. math::  X_{1,n+m}^{AB} ~=~ X_{1,n}^A + X_{1,m}^B
           :label: eqnpscomb

When we want to compute the partial variance from the two components we
have to make a correction :math:`\Delta\sigma`:

.. math:: \sigma_{1,n+m}^{AB} ~=~ \sigma_{1,n}^A + \sigma_{1,m}^B +\Delta\sigma

If we define :math:`x_i^{AB}` as the combined and renumbered set of
data points we can write:

.. math:: \sigma_{1,n+m}^{AB} ~=~ \sum_{i=1}^{n+m}  \left[x_i^{AB} - \frac{X_{1,n+m}^{AB}}{n+m}\right]^2

and thus

.. math::

   \sum_{i=1}^{n+m}  \left[x_i^{AB} - \frac{X_{1,n+m}^{AB}}{n+m}\right]^2  ~=~
   \sum_{i=1}^{n}  \left[x_i^{A} - \frac{X_{1,n}^{A}}{n}\right]^2  +
   \sum_{i=1}^{m}  \left[x_i^{B} - \frac{X_{1,m}^{B}}{m}\right]^2  +\Delta\sigma

or

.. math::

   \begin{aligned}
   &\sum_{i=1}^{n+m}  \left[(x_i^{AB})^2 - 2 x_i^{AB}\frac{X^{AB}_{1,n+m}}{n+m} + \left(\frac{X^{AB}_{1,n+m}}{n+m}\right)^2  \right] \\
   &\quad - \sum_{i=1}^{n}  \left[(x_i^{A})^2 - 2 x_i^{A}\frac{X^A_{1,n}}{n} + \left(\frac{X^A_{1,n}}{n}\right)^2  \right] \\
   &\quad - \sum_{i=1}^{m}  \left[(x_i^{B})^2 - 2 x_i^{B}\frac{X^B_{1,m}}{m} + \left(\frac{X^B_{1,m}}{m}\right)^2  \right] ~=~ \Delta\sigma
   \end{aligned}

All the :math:`x_i^2` terms drop out, and the terms independent of the
summation counter :math:`i` can be simplified:

.. math::

   \begin{aligned}
   &\frac{\left(X^{AB}_{1,n+m}\right)^2}{n+m} \,-\,
   \frac{\left(X^A_{1,n}\right)^2}{n} \,-\,
   \frac{\left(X^B_{1,m}\right)^2}{m} \\
   &\quad - 2\,\frac{X^{AB}_{1,n+m}}{n+m}\sum_{i=1}^{n+m}x_i^{AB} \,+\,
   2\,\frac{X^{A}_{1,n}}{n}\sum_{i=1}^{n}x_i^{A} \,+\,
   2\,\frac{X^{B}_{1,m}}{m}\sum_{i=1}^{m}x_i^{B} ~=~ \Delta\sigma
   \end{aligned}

We recognize the three partial sums on the second line and use
:eq:`eqn. %s <eqnpscomb>` to obtain:

.. math:: \Delta\sigma ~=~ \frac{\left(mX^A_{1,n} - nX^B_{1,m}\right)^2}{nm(n+m)}

If we check this by inserting :math:`m=1` we get back
:eq:`eqn. %s <eqnsimplevar0>`.

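The combination rule can be checked numerically with a short Python sketch.
The function name ``combine_runs`` and the two short example runs are
assumptions for illustration only.

.. code-block:: python

   def combine_runs(n, X_A, sigma_A, m, X_B, sigma_B):
       """Return (X_AB, sigma_AB) for run A (n steps) followed by run B (m steps)."""
       delta_sigma = (m * X_A - n * X_B) ** 2 / (n * m * (n + m))
       return X_A + X_B, sigma_A + sigma_B + delta_sigma


   # Hypothetical runs: A contains 4 and 7 (X = 11, sigma = 4.5),
   # B contains 13 and 16 (X = 29, sigma = 4.5).
   X_AB, sigma_AB = combine_runs(2, 11.0, 4.5, 2, 29.0, 4.5)
   print(X_AB, sigma_AB)

For the combined series 4, 7, 13, 16 the sum is 40 and the sum of squared
deviations from the mean of 10 is 90, so the correction term here is 81.
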
Summing energy terms
~~~~~~~~~~~~~~~~~~~~

The :ref:`gmx energy <gmx energy>` program
can also sum energy terms into one, *e.g.* potential + kinetic = total.
For the partial averages this is again easy if we have :math:`S` energy
components :math:`s`:

.. math::  X_{m,n}^S ~=~ \sum_{i=m}^n \sum_{s=1}^S x_i^s ~=~ \sum_{s=1}^S \sum_{i=m}^n x_i^s ~=~ \sum_{s=1}^S X_{m,n}^s
           :label: eqnsumterms

For the fluctuations it is less trivial again, considering for example
that the fluctuation in potential and kinetic energy should cancel.
Nevertheless we can try the same approach as before by writing:

.. math:: \sigma_{m,n}^S ~=~ \sum_{s=1}^S \sigma_{m,n}^s + \Delta\sigma

If we fill in :eq:`eqn. %s <eqnsigma>`, noting that the sum over
:math:`i=m,\ldots,n` contains :math:`n-m+1` terms:

.. math:: \sum_{i=m}^n \left[\left(\sum_{s=1}^S x_i^s\right) - \frac{X_{m,n}^S}{n-m+1}\right]^2 ~=~
          \sum_{s=1}^S \sum_{i=m}^n \left[\left(x_i^s\right) - \frac{X_{m,n}^s}{n-m+1}\right]^2 + \Delta\sigma
          :label: eqnsigmaterms

which we can expand to:

.. math::

   \begin{aligned}
   &\sum_{i=m}^n \left[\sum_{s=1}^S (x_i^s)^2 + 2\sum_{s=1}^S \sum_{s'=s+1}^S x_i^s x_i^{s'} + \left(\frac{X_{m,n}^S}{n-m+1}\right)^2 - 2\,\frac{X_{m,n}^S}{n-m+1}\sum_{s=1}^S x_i^s\right] \\
   &\quad - \sum_{s=1}^S \sum_{i=m}^n \left[(x_i^s)^2 - 2\,\frac{X_{m,n}^s}{n-m+1}\,x_i^s + \left(\frac{X_{m,n}^s}{n-m+1}\right)^2\right] ~=~\Delta\sigma
   \end{aligned}

The terms with :math:`(x_i^s)^2` cancel, so that we can simplify to:

.. math::

   \begin{aligned}
   &\frac{\left(X_{m,n}^S\right)^2}{n-m+1} -2 \frac{X_{m,n}^S}{n-m+1}\sum_{i=m}^n\sum_{s=1}^S x_i^s +2\sum_{i=m}^n\sum_{s=1}^S \sum_{s'=s+1}^S x_i^s x_i^{s'} \\
   &\quad - \sum_{s=1}^S \sum_{i=m}^n \left[- 2\,\frac{X_{m,n}^s}{n-m+1}\,x_i^s + \left(\frac{X_{m,n}^s}{n-m+1}\right)^2\right] ~=~\Delta\sigma
   \end{aligned}

or

.. math:: -\frac{\left(X_{m,n}^S\right)^2}{n-m+1}  +2\sum_{i=m}^n\sum_{s=1}^S \sum_{s'=s+1}^S x_i^s x_i^{s'}\, +  \sum_{s=1}^S \frac{\left(X_{m,n}^s\right)^2}{n-m+1}  ~=~\Delta\sigma

If we now expand the first term using
:eq:`eqn. %s <eqnsumterms>` we obtain:

.. math:: -\frac{\left(\sum_{s=1}^SX_{m,n}^s\right)^2}{n-m+1}  +2\sum_{i=m}^n\sum_{s=1}^S \sum_{s'=s+1}^S x_i^s x_i^{s'}\, +      \sum_{s=1}^S \frac{\left(X_{m,n}^s\right)^2}{n-m+1}  ~=~\Delta\sigma

which we can reformulate to:

.. math:: 2\left[\sum_{i=m}^n\sum_{s=1}^S \sum_{s'=s+1}^S x_i^s x_i^{s'} \,-\, \frac{1}{n-m+1}\sum_{s=1}^S \sum_{s'=s+1}^S X_{m,n}^s X_{m,n}^{s'}\right] ~=~\Delta\sigma

or

.. math:: 2\left[\sum_{s=1}^S \sum_{i=m}^nx_i^s \sum_{s'=s+1}^S x_i^{s'} \,-\, \frac{1}{n-m+1}\sum_{s=1}^S X_{m,n}^s \sum_{s'=s+1}^S X_{m,n}^{s'}\right] ~=~\Delta\sigma

which gives

.. math:: 2\sum_{s=1}^S \left[\sum_{i=m}^n x_i^s \sum_{s'=s+1}^S x_i^{s'} \,-\, \frac{X_{m,n}^s}{n-m+1} \sum_{s'=s+1}^S \sum_{i=m}^n x_i^{s'}\right] ~=~\Delta\sigma

Evaluating this requires all data points :math:`i`, which in general is
not possible because energies are not saved at every step. We can then
make an estimate of :math:`\sigma_{m,n}^S` from the data points that are
available, using the left hand side of :eq:`eqn. %s <eqnsigmaterms>`.
While the average can be computed using all time steps in the
simulation, the accuracy of the fluctuations is thus limited by the
frequency with which energies are saved. Since this estimate can easily
be computed with a program such as ``xmgr``, it is not
built into |Gromacs|.

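Such an estimate from saved frames can be sketched in Python as follows. The
energy values and the names ``partial_variance``, ``potential`` and
``kinetic`` are invented for this illustration; the sketch simply applies
the definition of the partial variance to the summed series.

.. code-block:: python

   def partial_variance(values):
       """Sum of squared deviations from the mean of the given frames."""
       mean = sum(values) / len(values)
       return sum((x - mean) ** 2 for x in values)


   # Hypothetical saved frames of two energy terms that largely compensate.
   potential = [-10.0, -12.0, -11.0, -13.0]
   kinetic = [10.5, 11.5, 11.0, 13.5]
   total = [u + k for u, k in zip(potential, kinetic)]

   sigma_total = partial_variance(total)
   sigma_terms = partial_variance(potential) + partial_variance(kinetic)
   delta_sigma = sigma_total - sigma_terms  # correction to the plain sum
   print(sigma_total, sigma_terms, delta_sigma)

In this example the correction is negative, which reflects the compensation
between the two terms: the summed series fluctuates far less than either
component on its own.
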
.. raw:: latex

    \clearpage
