docs/reference-manual/averages.rst

   1 Averages and fluctuations
   2 =========================
   3
   4 Formulae for averaging
   5 ----------------------
   6
   7 **Note:** this section was taken from ref \ :ref:`179 <refGunsteren94a>`.
   8
When analyzing an MD trajectory, averages :math:`\left<x\right>` and
fluctuations
  11
  12 .. math::  \left<(\Delta x)^2\right>^{{\frac{1}{2}}} ~=~ \left<[x-\left<x\right>]^2\right>^{{\frac{1}{2}}}
  13            :label: eqnvar0
  14
of a quantity :math:`x` are to be computed. The variance
:math:`\sigma_x` of a series of :math:`N_x` values, :math:`\{x_i\}`
(defined here as the sum of squared deviations, without the factor
:math:`1/N_x`), can be computed from
  18
  19 .. math:: \sigma_x~=~ \sum_{i=1}^{N_x} x_i^2 ~-~  \frac{1}{N_x}\left(\sum_{i=1}^{N_x}x_i\right)^2
  20           :label: eqnvar1
  21
Unfortunately, this formula is numerically not very accurate, especially
when :math:`\sigma_x^{{\frac{1}{2}}}` is small compared to the values of
:math:`x_i`, because two large and nearly equal numbers are subtracted.
The following (equivalent) expression is numerically more accurate:
  26
  27 .. math:: \sigma_x ~=~ \sum_{i=1}^{N_x} [x_i  - \left<x\right>]^2
  28           :label: eqnvar1equivalent
  29
  30 with
  31
  32 .. math:: \left<x\right> ~=~ \frac{1}{N_x} \sum_{i=1}^{N_x} x_i
  33           :label: eqnvar2
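
As a minimal illustration of the accuracy difference (a Python sketch, not
|Gromacs| code; the function names and the data are hypothetical), both
expressions can be evaluated for a series with a large offset and small
fluctuations. The exact sum of squared deviations of this series is 0.1.

```python
def sigma_sum_of_squares(xs):
    """Sum of squared deviations as sum(x^2) - (sum x)^2 / N."""
    n = len(xs)
    return sum(x * x for x in xs) - sum(xs) ** 2 / n


def sigma_about_mean(xs):
    """Sum of squared deviations as sum((x - <x>)^2), using the mean first."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs)


# Hypothetical data: offset of 1e9 with fluctuations of order 0.1.
offset = 1.0e9
xs = [offset + d for d in (0.1, 0.2, 0.3, 0.4, 0.5)]

error_sum_of_squares = abs(sigma_sum_of_squares(xs) - 0.1)
error_about_mean = abs(sigma_about_mean(xs) - 0.1)
print(error_sum_of_squares, error_about_mean)
```

In double precision the first error is many orders of magnitude larger than
the second, since the squares are of order :math:`10^{18}` while the
quantity of interest is of order :math:`10^{-1}`.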
  34
Using :eq:`eqns. %s <eqnvar1equivalent>` and
:eq:`%s <eqnvar2>` one has to go through the series of
:math:`x_i` values twice, once to determine :math:`\left<x\right>` and
again to compute :math:`\sigma_x`, whereas
:eq:`eqn. %s <eqnvar1>` requires only one sequential scan of
the series :math:`\{x_i\}`. However, one may cast
:eq:`eqn. %s <eqnvar1equivalent>` in another form, containing partial
sums, which allows for a sequential update algorithm. Define the partial
sum
  44
  45 .. math:: X_{n,m} ~=~ \sum_{i=n}^{m} x_i
  46           :label: eqnpartialsum
  47
  48 and the partial variance
  49
  50 .. math::  \sigma_{n,m} ~=~ \sum_{i=n}^{m}  \left[x_i - \frac{X_{n,m}}{m-n+1}\right]^2
  51            :label: eqnsigma
  52
  53 It can be shown that
  54
  55 .. math::  X_{n,m+k} ~=~  X_{n,m} + X_{m+1,m+k}
  56            :label: eqnXpartial
  57
  58 and
  59
.. math:: \begin{aligned}
          \sigma_{n,m+k} &=& \sigma_{n,m} + \sigma_{m+1,m+k} + \left[~\frac {X_{n,m}}{m-n+1} - \frac{X_{n,m+k}}{m+k-n+1}~\right]^2~\times \nonumber\\
          && ~\frac{(m-n+1)(m+k-n+1)}{k}
          \end{aligned}
          :label: eqnvarpartial
  65
  66 For :math:`n=1` one finds
  67
  68 .. math:: \sigma_{1,m+k} ~=~ \sigma_{1,m} + \sigma_{m+1,m+k}~+~
  69           \left[~\frac{X_{1,m}}{m} - \frac{X_{1,m+k}}{m+k}~\right]^2~ \frac{m(m+k)}{k}
  70           :label: eqnsig1
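
The merge of two adjacent blocks can be checked numerically. The following
Python sketch (hypothetical helper names, not the |Gromacs| implementation)
evaluates the partial sum and partial variance of each block directly and
then merges them with the relations above, where ``count_a`` plays the role
of :math:`m-n+1` and ``count_b`` the role of :math:`k`.

```python
def block_stats(xs):
    """Return (count, partial sum, partial variance) of one block, evaluated directly."""
    total = sum(xs)
    mean = total / len(xs)
    return len(xs), total, sum((x - mean) ** 2 for x in xs)


def merge_blocks(count_a, total_a, sigma_a, count_b, total_b, sigma_b):
    """Merge a block with the block that directly follows it."""
    count = count_a + count_b
    total = total_a + total_b
    # Difference between the mean of the first block and the merged mean.
    mean_shift = total_a / count_a - total / count
    sigma = sigma_a + sigma_b + mean_shift ** 2 * count_a * count / count_b
    return count, total, sigma


data = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]  # hypothetical values
merged = merge_blocks(*block_stats(data[:2]), *block_stats(data[2:]))
direct = block_stats(data)
```

For this series both ``merged`` and ``direct`` give a partial sum of 21 and
a partial variance of 17.5.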
  71
  72 and for :math:`n=1` and :math:`k=1`
  73 :eq:`eqn. %s <eqnvarpartial>` becomes
  74
  75 .. math:: \begin{aligned}
  76           \sigma_{1,m+1}  &=& \sigma_{1,m} +
  77           \left[\frac{X_{1,m}}{m} - \frac{X_{1,m+1}}{m+1}\right]^2 m(m+1)\\
  78           &=& \sigma_{1,m} +
  79           \frac {[~X_{1,m} - m x_{m+1}~]^2}{m(m+1)}
  80           \end{aligned}
  81           :label: eqnsimplevar0
  82
  83 where we have used the relation
  84
  85 .. math:: X_{1,m+1} ~=~  X_{1,m} + x_{m+1}
  86           :label: eqnsimplevar1
  87
  88 Using formulae :eq:`eqn. %s <eqnsimplevar0>` and
  89 :eq:`eqn. %s <eqnsimplevar1>` the average
  90
  91 .. math:: \left<x\right> ~=~ \frac{X_{1,N_x}}{N_x}
  92           :label: eqnfinalaverage
  93
  94 and the fluctuation
  95
  96 .. math:: \left<(\Delta x)^2\right>^{{\frac{1}{2}}} = \left[\frac {\sigma_{1,N_x}}{N_x}\right]^{{\frac{1}{2}}}
  97           :label: eqnfinalfluctuation
  98
  99 can be obtained by one sweep through the data.
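
The single sweep can be sketched as follows in Python. This is an
illustration of the update relations only, with hypothetical names, and it
makes no claim about how the |Gromacs| source organizes the same
computation.

```python
import math


def running_stats(xs):
    """Return (X_{1,N}, sigma_{1,N}) from one pass over the series xs."""
    total = 0.0  # X_{1,m}
    sigma = 0.0  # sigma_{1,m}
    for m, x in enumerate(xs):  # m values have been accumulated so far
        if m > 0:
            # Update sigma_{1,m} -> sigma_{1,m+1} before X_{1,m} is changed.
            sigma += (total - m * x) ** 2 / (m * (m + 1))
        total += x  # X_{1,m+1} = X_{1,m} + x_{m+1}
    return total, sigma


xs = [1.0, 2.0, 3.0, 4.0]  # hypothetical values
total, sigma = running_stats(xs)
average = total / len(xs)
fluctuation = math.sqrt(sigma / len(xs))
```

Here ``total`` is 10, ``sigma`` is 5 and ``average`` is 2.5, in agreement
with a direct two-pass evaluation of the same four values.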
 100
 101 Implementation
 102 --------------
 103
In |Gromacs| the instantaneous energies :math:`E(m)` are stored in the
:ref:`energy file <edr>`, along with the values of :math:`\sigma_{1,m}` and
:math:`X_{1,m}`. Although simulation steps are counted from 0, for the
energy averages and fluctuations they are counted from 1, so the
equations presented here are the ones that are implemented. We give
somewhat lengthy derivations in this section to simplify checking of
code and equations later on.
 111
 112 Part of a Simulation
 113 ~~~~~~~~~~~~~~~~~~~~
 114
It is not uncommon to perform a simulation where the first part, *e.g.*
100 ps, is taken as equilibration. However, the averages and
fluctuations as printed in the :ref:`log file <log>` are computed over the whole
simulation. The equilibration period, which is then included in these
numbers, may invalidate the averages and fluctuations, because
they are dominated by the initial drift towards
equilibrium.
 122
 123 Using :eq:`eqns. %s <eqnXpartial>` and
 124 :eq:`%s <eqnvarpartial>` the average and standard deviation
 125 over part of the trajectory can be computed as:
 126
 127 .. math:: \begin{aligned}
 128           X_{m+1,m+k}     &=& X_{1,m+k} - X_{1,m}                 \\
 129           \sigma_{m+1,m+k} &=& \sigma_{1,m+k}-\sigma_{1,m} - \left[~\frac{X_{1,m}}{m} - \frac{X_{1,m+k}}{m+k}~\right]^{2}~ \frac{m(m+k)}{k}\end{aligned}
 130           :label: eqnaveragesimpart
 131
or, more generally (with :math:`p > 1` and :math:`q \geq p`; for :math:`p = 1` the stored values :math:`X_{1,q}` and :math:`\sigma_{1,q}` can be used directly):
 133
 134 .. math:: \begin{aligned}
 135           X_{p,q}         &=&     X_{1,q} - X_{1,p-1}     \\
 136           \sigma_{p,q}    &=&     \sigma_{1,q}-\sigma_{1,p-1} - \left[~\frac{X_{1,p-1}}{p-1} - \frac{X_{1,q}}{q}~\right]^{2}~ \frac{(p-1)q}{q-p+1}\end{aligned}
 137           :label: eqnaveragesimpartgeneral
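
As a sketch of how these relations can be used, the Python fragment below
discards the first three values of a short series as equilibration. The
helper names and the data are hypothetical; ``cumulative`` stands in for the
stored values :math:`X_{1,m}` and :math:`\sigma_{1,m}`, which are evaluated
directly here for simplicity.

```python
import math


def cumulative(xs, m):
    """Return (X_{1,m}, sigma_{1,m}) for the first m values of xs."""
    head = xs[:m]
    total = sum(head)
    mean = total / m
    return total, sum((x - mean) ** 2 for x in head)


def window_stats(p, q, total_1q, sigma_1q, total_prev, sigma_prev):
    """Return (X_{p,q}, sigma_{p,q}) from cumulative values at q and at p-1."""
    if p == 1:
        return total_1q, sigma_1q  # nothing to subtract
    m = p - 1       # number of discarded values
    k = q - p + 1   # number of values in the window
    shift = total_prev / m - total_1q / q
    sigma = sigma_1q - sigma_prev - shift ** 2 * m * q / k
    return total_1q - total_prev, sigma


energies = [10.0, 8.0, 6.0, 5.0, 5.0, 4.0, 6.0, 5.0]  # initial drift, then plateau
p, q = 4, 8
total, sigma = window_stats(p, q, *cumulative(energies, q),
                            *cumulative(energies, p - 1))
average = total / (q - p + 1)
fluctuation = math.sqrt(max(sigma, 0.0) / (q - p + 1))
```

The window :math:`x_4,\ldots,x_8` gives a partial sum of 25 and a partial
variance of 2, the same as a direct evaluation over those five values.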
 138
 139 **Note** that implementation of this is not entirely trivial, since
 140 energies are not stored every time step of the simulation. We therefore
 141 have to construct :math:`X_{1,p-1}` and :math:`\sigma_{1,p-1}` from the
 142 information at time :math:`p` using :eq:`eqns. %s <eqnsimplevar0>` and
 143 :eq:`%s <eqnsimplevar1>`:
 144
 145 .. math:: \begin{aligned}
 146           X_{1,p-1}       &=&     X_{1,p} - x_p   \\
 147           \sigma_{1,p-1}  &=&     \sigma_{1,p} -  \frac {[~X_{1,p-1} - (p-1) x_{p}~]^2}{(p-1)p}\end{aligned}
 148           :label: eqnfinalaveragesimpartnote
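
A minimal Python sketch of this backward step, assuming the cumulative
values and the instantaneous value :math:`x_p` are available at step
:math:`p` (the function name is hypothetical):

```python
def step_back(p, total_1p, sigma_1p, x_p):
    """Return (X_{1,p-1}, sigma_{1,p-1}) from the cumulative values at p and x_p."""
    if p < 1:
        raise ValueError("steps are counted from 1")
    if p == 1:
        return 0.0, 0.0  # empty series before the first step
    m = p - 1
    total_prev = total_1p - x_p
    sigma_prev = sigma_1p - (total_prev - m * x_p) ** 2 / (m * p)
    return total_prev, sigma_prev


# Hypothetical series 1, 2, 3, 4: X_{1,4} = 10, sigma_{1,4} = 5, x_4 = 4.
total_prev, sigma_prev = step_back(4, 10.0, 5.0, 4.0)
```

The result, a partial sum of 6 and a partial variance of 2, matches the
direct evaluation for the first three values.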
 149
 150 Combining two simulations
 151 ~~~~~~~~~~~~~~~~~~~~~~~~~
 152
Another frequently occurring problem is that the fluctuations of two
simulations must be combined. Consider the following example: we have
two simulations, (A) of :math:`n` steps and (B) of :math:`m` steps, in which
the second simulation is a continuation of the first. However, the
second simulation starts numbering from 1 instead of from :math:`n+1`.
For the partial sum this is no problem; we only have to add :math:`X_{1,n}^A`
from run A:
 160
 161 .. math::  X_{1,n+m}^{AB} ~=~ X_{1,n}^A + X_{1,m}^B
 162            :label: eqnpscomb
 163
 164 When we want to compute the partial variance from the two components we
 165 have to make a correction :math:`\Delta\sigma`:
 166
 167 .. math:: \sigma_{1,n+m}^{AB} ~=~ \sigma_{1,n}^A + \sigma_{1,m}^B +\Delta\sigma
 168           :label: eqnscombcorr
 169
If we define :math:`x_i^{AB}` as the combined and renumbered set of
data points, we can write:
 172
 173 .. math:: \sigma_{1,n+m}^{AB} ~=~ \sum_{i=1}^{n+m}  \left[x_i^{AB} - \frac{X_{1,n+m}^{AB}}{n+m}\right]^2
 174           :label: eqnpscombpoints
 175
 176 and thus
 177
 178 .. math:: \sum_{i=1}^{n+m}  \left[x_i^{AB} - \frac{X_{1,n+m}^{AB}}{n+m}\right]^2  ~=~
 179           \sum_{i=1}^{n}  \left[x_i^{A} - \frac{X_{1,n}^{A}}{n}\right]^2  +
 180           \sum_{i=1}^{m}  \left[x_i^{B} - \frac{X_{1,m}^{B}}{m}\right]^2  +\Delta\sigma
 181           :label: eqnpscombresult
 182
 183 or
 184
 185 .. math:: \begin{aligned}
 186           \sum_{i=1}^{n+m}  \left[(x_i^{AB})^2 - 2 x_i^{AB}\frac{X^{AB}_{1,n+m}}{n+m} + \left(\frac{X^{AB}_{1,n+m}}{n+m}\right)^2  \right] &-& \nonumber \\
 187           \sum_{i=1}^{n}  \left[(x_i^{A})^2 - 2 x_i^{A}\frac{X^A_{1,n}}{n} + \left(\frac{X^A_{1,n}}{n}\right)^2  \right] &-& \nonumber \\
 188           \sum_{i=1}^{m}  \left[(x_i^{B})^2 - 2 x_i^{B}\frac{X^B_{1,m}}{m} + \left(\frac{X^B_{1,m}}{m}\right)^2  \right] &=& \Delta\sigma\end{aligned}
 189           :label: eqnpscombresult2
 190
All the :math:`x_i^2` terms drop out, and the terms independent of the
summation counter :math:`i` can be simplified:
 193
 194 .. math:: \begin{aligned}
 195           \frac{\left(X^{AB}_{1,n+m}\right)^2}{n+m} \,-\,
 196           \frac{\left(X^A_{1,n}\right)^2}{n} \,-\,
 197           \frac{\left(X^B_{1,m}\right)^2}{m} &-& \nonumber \\
 198           2\,\frac{X^{AB}_{1,n+m}}{n+m}\sum_{i=1}^{n+m}x_i^{AB} \,+\,
 199           2\,\frac{X^{A}_{1,n}}{n}\sum_{i=1}^{n}x_i^{A} \,+\,
 200           2\,\frac{X^{B}_{1,m}}{m}\sum_{i=1}^{m}x_i^{B} &=& \Delta\sigma\end{aligned}
 201           :label: eqnpscombsimp
 202
We recognize the three partial sums on the second line and use
:eq:`eqn. %s <eqnpscomb>` to obtain:
 205
 206 .. math:: \Delta\sigma ~=~ \frac{\left(mX^A_{1,n} - nX^B_{1,m}\right)^2}{nm(n+m)}
 207           :label: eqnpscombused
 208
If we check this by inserting :math:`m=1` (a single added data point, with
:math:`n` playing the role of :math:`m` there), we get back
:eq:`eqn. %s <eqnsimplevar0>`.
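
The combination of two runs can be sketched in Python as follows. The
function name and the numbers are hypothetical; the inputs are the step
counts, partial sums and partial variances that each run reports on its
own.

```python
def combine_runs(n, total_a, sigma_a, m, total_b, sigma_b):
    """Combine run A (n steps) with its continuation B (m steps, numbered from 1)."""
    delta_sigma = (m * total_a - n * total_b) ** 2 / (n * m * (n + m))
    return n + m, total_a + total_b, sigma_a + sigma_b + delta_sigma


# Run A holds the values 1, 2 and run B the values 3, 4, 5, 6.
steps, total, sigma = combine_runs(2, 3.0, 0.5, 4, 18.0, 5.0)
average = total / steps
```

Here the correction :math:`\Delta\sigma` is 12, so the combined partial
variance is 17.5 and the combined average is 3.5, as for the full series
1 to 6.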
 211
 212 Summing energy terms
 213 ~~~~~~~~~~~~~~~~~~~~
 214
 215 The :ref:`gmx energy <gmx energy>` program
 216 can also sum energy terms into one, *e.g.* potential + kinetic = total.
 217 For the partial averages this is again easy if we have :math:`S` energy
 218 components :math:`s`:
 219
 220 .. math::  X_{m,n}^S ~=~ \sum_{i=m}^n \sum_{s=1}^S x_i^s ~=~ \sum_{s=1}^S \sum_{i=m}^n x_i^s ~=~ \sum_{s=1}^S X_{m,n}^s
 221            :label: eqnsumterms
 222
For the fluctuations it is again less trivial, considering for example
that the fluctuations in potential and kinetic energy should cancel.
Nevertheless we can try the same approach as before by writing:
 226
 227 .. math:: \sigma_{m,n}^S ~=~ \sum_{s=1}^S \sigma_{m,n}^s + \Delta\sigma
 228           :label: eqnsigmatermsfluct
 229
If we fill in :eq:`eqn. %s <eqnsigma>`, noting that the range
:math:`i=m,\ldots,n` contains :math:`n-m+1` points:

.. math:: \sum_{i=m}^n \left[\left(\sum_{s=1}^S x_i^s\right) - \frac{X_{m,n}^S}{n-m+1}\right]^2 ~=~
          \sum_{s=1}^S \sum_{i=m}^n \left[\left(x_i^s\right) - \frac{X_{m,n}^s}{n-m+1}\right]^2 + \Delta\sigma
          :label: eqnsigmaterms

which we can expand to:

.. math:: \begin{aligned}
          &~&\sum_{i=m}^n \left[\sum_{s=1}^S (x_i^s)^2 + \left(\frac{X_{m,n}^S}{n-m+1}\right)^2 -2\left(\frac{X_{m,n}^S}{n-m+1}\sum_{s=1}^S x_i^s - \sum_{s=1}^S \sum_{s'=s+1}^S x_i^s x_i^{s'} \right)\right]    \nonumber \\
          &-&\sum_{s=1}^S \sum_{i=m}^n \left[(x_i^s)^2 - 2\,\frac{X_{m,n}^s}{n-m+1}\,x_i^s + \left(\frac{X_{m,n}^s}{n-m+1}\right)^2\right] ~=~\Delta\sigma \end{aligned}
          :label: eqnsimtermsexpanded

The terms with :math:`(x_i^s)^2` cancel, so that we can simplify to:

.. math:: \begin{aligned}
          &~&\frac{\left(X_{m,n}^S\right)^2}{n-m+1} -2 \frac{X_{m,n}^S}{n-m+1}\sum_{i=m}^n\sum_{s=1}^S x_i^s +2\sum_{i=m}^n\sum_{s=1}^S \sum_{s'=s+1}^S x_i^s x_i^{s'}\, -        \nonumber \\
          &~&\sum_{s=1}^S \sum_{i=m}^n \left[- 2\,\frac{X_{m,n}^s}{n-m+1}\,x_i^s + \left(\frac{X_{m,n}^s}{n-m+1}\right)^2\right] ~=~\Delta\sigma \end{aligned}
          :label: eqnsigmatermssimplefied

or

.. math:: -\frac{\left(X_{m,n}^S\right)^2}{n-m+1}  +2\sum_{i=m}^n\sum_{s=1}^S \sum_{s'=s+1}^S x_i^s x_i^{s'}\, +  \sum_{s=1}^S \frac{\left(X_{m,n}^s\right)^2}{n-m+1}  ~=~\Delta\sigma
           :label: eqnsigmatermsalternative

If we now expand the first term using
:eq:`eqn. %s <eqnsumterms>` we obtain:

.. math:: -\frac{\left(\sum_{s=1}^SX_{m,n}^s\right)^2}{n-m+1}  +2\sum_{i=m}^n\sum_{s=1}^S \sum_{s'=s+1}^S x_i^s x_i^{s'}\, +      \sum_{s=1}^S \frac{\left(X_{m,n}^s\right)^2}{n-m+1}  ~=~\Delta\sigma
          :label: eqnsigmatermsfirstexpand

which we can reformulate to:

.. math:: -2\left[\frac{1}{n-m+1}\sum_{s=1}^S \sum_{s'=s+1}^S X_{m,n}^s X_{m,n}^{s'}\,-\sum_{i=m}^n\sum_{s=1}^S \sum_{s'=s+1}^S x_i^s x_i^{s'}\right] ~=~\Delta\sigma
          :label: eqnsigmatermsreformed

or

.. math:: -2\left[\frac{1}{n-m+1}\sum_{s=1}^S X_{m,n}^s \sum_{s'=s+1}^S X_{m,n}^{s'}\,-\,\sum_{s=1}^S \sum_{i=m}^nx_i^s \sum_{s'=s+1}^S x_i^{s'}\right] ~=~\Delta\sigma
          :label: eqnsigmatermsreformedalternative

which gives

.. math:: -2\sum_{s=1}^S \left[\frac{X_{m,n}^s}{n-m+1} \sum_{s'=s+1}^S \sum_{i=m}^n x_i^{s'}\,-\,\sum_{i=m}^n x_i^s \sum_{s'=s+1}^S x_i^{s'}\right] ~=~\Delta\sigma
          :label: eqnsigmatermsfinal
 275
Since we need all data points :math:`i` to evaluate this, in general
this is not possible. We can then make an estimate of
:math:`\sigma_{m,n}^S` from only the data points that are available,
using the left hand side of :eq:`eqn. %s <eqnsigmaterms>`.
While the average can be computed using all time steps in the
simulation, the accuracy of the fluctuations is thus limited by the
frequency with which energies are saved. Since this can be easily done
with a program such as ``xmgr``, this is not
built into |Gromacs|.
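
Such an estimate from the saved data points can be sketched in Python as
follows. The names and values are hypothetical and the fragment is not part
of |Gromacs|. It sums the components frame by frame, evaluates the partial
variance of the sum directly, and reports the correction
:math:`\Delta\sigma` as the difference with the summed component variances.
For two components this correction equals twice their summed co-deviation,
which ``cross_term`` evaluates as a check.

```python
def sigma_of(xs):
    """Sum of squared deviations from the mean of xs."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs)


def summed_fluctuation(series):
    """series holds one list of saved values per energy component, all equally long."""
    summed = [sum(frame) for frame in zip(*series)]
    sigma_sum = sigma_of(summed)
    delta_sigma = sigma_sum - sum(sigma_of(s) for s in series)
    return summed, sigma_sum, delta_sigma


def cross_term(a, b):
    """Twice the summed co-deviation of two components."""
    n = len(a)
    return 2.0 * (sum(x * y for x, y in zip(a, b)) - sum(a) * sum(b) / n)


# Hypothetical saved frames in which the two terms cancel exactly.
potential = [-10.0, -12.0, -11.0, -13.0]
kinetic = [5.0, 7.0, 6.0, 8.0]
summed, sigma_sum, delta_sigma = summed_fluctuation([potential, kinetic])
```

In this example each component has a partial variance of 5, the sum is
constant, and the correction is therefore -10.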
 285
 286 .. raw:: latex
 287
 288     \clearpage
 289
 290