docs/manual/analyse.tex

   1 %
   2 % This file is part of the GROMACS molecular simulation package.
   3 %
   4 % Copyright (c) 2013,2014, by the GROMACS development team, led by
   5 % Mark Abraham, David van der Spoel, Berk Hess, and Erik Lindahl,
   6 % and including many others, as listed in the AUTHORS file in the
   7 % top-level source directory and at http://www.gromacs.org.
   8 %
   9 % GROMACS is free software; you can redistribute it and/or
  10 % modify it under the terms of the GNU Lesser General Public License
  11 % as published by the Free Software Foundation; either version 2.1
  12 % of the License, or (at your option) any later version.
  13 %
  14 % GROMACS is distributed in the hope that it will be useful,
  15 % but WITHOUT ANY WARRANTY; without even the implied warranty of
  16 % MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
  17 % Lesser General Public License for more details.
  18 %
  19 % You should have received a copy of the GNU Lesser General Public
  20 % License along with GROMACS; if not, see
  21 % http://www.gnu.org/licenses, or write to the Free Software Foundation,
  22 % Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301  USA.
  23 %
  24 % If you want to redistribute modifications to GROMACS, please
  25 % consider that scientific software is very special. Version
  26 % control is crucial - bugs must be traceable. We will be happy to
  27 % consider code for inclusion in the official distribution, but
  28 % derived work must not be called official GROMACS. Details are found
  29 % in the README & COPYING files - if they are missing, get the
  30 % official version at http://www.gromacs.org.
  31 %
  32 % To help us fund GROMACS development, we humbly ask that you cite
  33 % the research papers on the package. Check out http://www.gromacs.org.
  34
  35 \chapter{Analysis}
  36 \label{ch:analysis}
  37 In this chapter different ways of analyzing your trajectory are described.
  38 The names of the corresponding analysis programs are given.
  39 Specific information on the in- and output of these programs can be found
  40 in the online manual at {\wwwpage}.
  41 The output files are often produced as finished Grace/Xmgr graphs.
  42
  43 First, in \secref{usinggroups}, the group concept in analysis is explained.
  44 \ssecref{selections} explains a newer concept of dynamic selections,
  45 which is currently supported by a few tools.
  46 Then, the different analysis tools are presented.
  47
  48 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% Groups in Analysis
  49
  50 \section{Using Groups}
  51 \label{sec:usinggroups}\index{groups}
  52 {\tt gmx make_ndx, gmx mk_angndx, gmx select}\\
  53 In \chref{algorithms}, it was explained how {\em groups of
  54 atoms} can be used in {\tt mdrun} (see~\secref{groupconcept}).
  55 In most analysis programs, groups
  56 of atoms must also be chosen. Most programs can generate several default
  57 index groups, but groups can always be read from an index file. Let's
  58 consider the example of a simulation of a binary mixture of components A and B. When
  59 we want to calculate the radial distribution function (RDF)
  60 $g_{AB}(r)$ of A with respect to B, we have to calculate:
  61 \beq
  62 4\pi r^2 g_{AB}(r)      ~=~     V~\sum_{i \in A}^{N_A} \sum_{j \in B}^{N_B} P(r)
  63 \eeq
  64 where $V$ is the volume and $P(r)$ is the probability of finding a B atom
  65 at distance $r$ from an A atom.
  66
  67 By having the user define the {\em atom numbers} for groups A and B in
  68 a simple file, we can calculate this $g_{AB}$ in the most general way, without
  69 having to make any assumptions in the RDF program about the type of
  70 particles.
  71
  72 Groups can therefore consist of a series of {\em atom numbers}, but in
  73 some cases also of {\em molecule numbers}.  It is also possible to
  74 specify a series of angles by {\em triples} of {\em atom numbers},
  75 dihedrals by {\em quadruples} of {\em atom numbers} and bonds or
  76 vectors (in a molecule) by {\em pairs} of {\em atom numbers}. When
  77 appropriate the type of index file will be specified for the following
  78 analysis programs.  To help creating such \swapindex{index}{file}s ({\tt
  79 index.ndx}), there are a couple of programs to generate them, using
  80 either your input configuration or the topology.  To generate an
  81 index file consisting of a series of {\em atom numbers} (as in the
  82 example of $g_{AB}$), use {\tt \normindex{gmx make_ndx}} or
  83 {\tt \normindex{gmx select}}.  To generate an index file
  84 with angles or dihedrals, use {\tt \normindex{gmx mk_angndx}}.
  85 Of course you can also
  86 make them by hand. The general format is presented here:
  87
  88 \begin{verbatim}
  89 [ Oxygen ]
  90    1       4       7
  91
  92 [ Hydrogen ]
  93    2       3       5       6
  94    8       9
  95 \end{verbatim}
  96
  97 First, the group name is written between square brackets. The following
  98 atom numbers may be spread out over as many lines as you like. The
  99 atom numbering starts at 1.
 100
 101 \swapindexquiet{choosing}{groups}%
 102 Each tool that can use groups will offer the available alternatives
 103 for the user to choose. That choice can be made with the number of the
 104 group, or its name. In fact, the first few letters of the group
 105 name will suffice if that will distinguish the group from all others.
 106 There are ways to use Unix shell features to choose group names
 107 on the command line, rather than interactively. Consult {\wwwpage}
 108 for suggestions.
 109
 110 \subsection{Default Groups}
 111 \label{subsec:defaultgroups}
 112 When no index file is supplied to analysis tools or {\tt grompp},
 113 a number of \swapindex{default}{groups} are generated to choose from:
 114 \begin{description}
 115 \item[{\tt System}]\mbox{}\\
 116         all atoms in the system
 117 \item[{\tt Protein}]\mbox{}\\
 118         all protein atoms
 119 \item[{\tt Protein-H}]\mbox{}\\
 120         protein atoms excluding hydrogens
 121 \item[{\tt C-alpha}]\mbox{}\\
 122         C$_{\alpha}$ atoms
 123 \item[{\tt Backbone}]\mbox{}\\
 124         protein backbone atoms; N, C$_{\alpha}$ and C
 125 \item[{\tt MainChain}]\mbox{}\\
 126         protein main chain atoms: N, C$_{\alpha}$, C and O, including
 127         oxygens in C-terminus
 128 \item[{\tt MainChain+Cb}]\mbox{}\\
 129         protein main chain atoms including C$_{\beta}$
 130 \item[{\tt MainChain+H}]\mbox{}\\
 131         protein main chain atoms including backbone amide hydrogens and
 132         hydrogens on the N-terminus
 133 \item[{\tt SideChain}]\mbox{}\\
 134         protein side chain atoms; that is all atoms except N,
 135         C$_{\alpha}$, C, O, backbone amide hydrogens, oxygens in
 136         C-terminus and hydrogens on the N-terminus
 137 \item[{\tt SideChain-H}]\mbox{}\\
 138         protein side chain atoms excluding all hydrogens
 139 %\ifthenelse{\equal{\gmxlite}{1}}{}{
 140 \item[{\tt Prot-Masses}]\mbox{}\\
 141         protein atoms excluding dummy masses (as used in virtual site
 142         constructions of NH$_3$ groups and tryptophan side-chains),
 143         see also \secref{vsitetop}; this group is only included when
 144         it differs from the ``{\tt Protein}'' group
 145 %} % Brace matches ifthenelse test for gmxlite
 146 \item[{\tt Non-Protein}]\mbox{}\\
 147         all non-protein atoms
 148 \item[{\tt DNA}]\mbox{}\\
 149         all DNA atoms
 150 \item[{\tt RNA}]\mbox{}\\
 151         all RNA atoms
 152 \item[{\tt Water}]\mbox{}\\
 153         water molecules (names like {\tt SOL}, {\tt WAT}, {\tt HOH}, etc.)  See
 154         {\tt \normindex{residuetypes.dat}} for a full listing
 155 \item[{\tt non-Water}]\mbox{}\\
 156         anything not covered by the {\tt Water} group
 157 \item[{\tt Ion}]\mbox{}\\
 158         any name matching an Ion entry in {\tt \normindex{residuetypes.dat}}
 159 \item[{\tt Water_and_Ions}]\mbox{}\\
 160         combination of the {\tt Water} and {\tt Ions} groups
 161 \item[{\tt molecule_name}]\mbox{}\\
 162         for all residues/molecules which are not recognized as protein,
 163         DNA, or RNA; one group per residue/molecule name is generated
 164 \item[{\tt Other}]\mbox{}\\
 165         all atoms which are neither protein, DNA, nor RNA.
 166 \end{description}
 167 Empty groups will not be generated.
 168 Most of the groups only contain protein atoms.
 169 An atom is considered a protein atom if its residue name is listed
 170 in the {\tt \normindex{residuetypes.dat}} file and is listed as a
 171 ``Protein'' entry.  The process for determinding DNA, RNA, etc. is
 172 analogous. If you need to modify these classifications, then you
 173 can copy the file from the library directory into your working
 174 directory and edit the local copy.
 175
 176 \subsection{Selections}
 177 \label{subsec:selections}
 178 {\tt \normindex{gmx select}}\\
 179 Currently, a few analysis tools support an extended concept of {\em
 180 (dynamic) \normindex{selections}}.  There are three main differences to
 181 traditional index groups:
 182 \begin{itemize}
 183   \item The selections are specified as text instead of reading fixed
 184     atom indices from a file, using a syntax similar to VMD.  The text
 185     can be entered interactively, provided on the command line, or from
 186     a file.
 187   \item The selections are not restricted to atoms, but can also specify
 188     that the analysis is to be performed on, e.g., center-of-mass
 189     positions of a group of atoms.
 190     Some tools may not support selections that do not evaluate to single
 191     atoms, e.g., if they require information that is available only for
 192     single atoms, like atom names or types.
 193   \item The selections can be dynamic, i.e., evaluate to different atoms
 194     for different trajectory frames.  This allows analyzing only a
 195     subset of the system that satisfies some geometric criteria.
 196 \end{itemize}
 197 As an example of a simple selection,
 198 {\tt resname ABC and within 2 of resname DEF}
 199 selects all atoms in residues named ABC that are within 2\,nm of any
 200 atom in a residue named DEF.
 201
 202 Tools that accept selections can also use traditional index files
 203 similarly to older tools: it is possible to give an {\tt .ndx} file to
 204 the tool, and directly select a group from the index file as a
 205 selection, either by group number or by group name.  The index groups
 206 can also be used as a part of a more complicated selection.
 207
 208 To get started, you can run {\tt gmx select} with a single structure,
 209 and use the interactive prompt to try out different selections.
 210 The tool provides, among others, output options {\tt -on} and
 211 {\tt -ofpdb} to write out the selected atoms to an index file and to a
 212 {\tt .pdb} file, respectively.  This does not allow testing selections
 213 that evaluate to center-of-mass positions, but other selections can be
 214 tested and the result examined.
 215
 216 The detailed syntax and the individual keywords that can be used in
 217 selections can be accessed by typing {\tt help} in the interactive
 218 prompt of any selection-enabled tool, as well as with
 219 {\tt gmx help selections}.  The help is divided into subtopics that can
 220 be accessed with, e.g., {\tt help syntax} / {\tt gmx help selections
 221 syntax}.  Some individual selection keywords have extended help as well,
 222 which can be accessed with, e.g., {\tt help keywords within}.
 223
 224 The interactive prompt does not currently provide much editing
 225 capabilities.  If you need them, you can run the program under
 226 {\tt rlwrap}.
 227
 228 For tools that do not yet support the selection syntax, you can use
 229 {\tt gmx select -on} to generate static index groups to pass to the
 230 tool.  However, this only allows for a small subset (only the first
 231 bullet from the above list) of the flexibility that fully
 232 selection-aware tools offer.
 233
 234 It is also possible to write your own analysis
 235 tools to take advantage of the flexibility of these selections: see the
 236 {\tt template.cpp} file in the {\tt share/gromacs/template} directory of
 237 your installation for an example.
 238
 239 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% Looking at your trajectory
 240
 241 \section{Looking at your trajectory}
 242 \label{sec:lookwhostalking}
 243 \begin{figure}
 244 \centerline{
 245 {\includegraphics[width=8cm,angle=90]{plots/ngmxdump}}}
 246 \caption{The window of {\tt gmx view} showing a box of water.}
 247 \label{fig:ngmxdump}
 248 \end{figure}
 249 {\tt gmx view}\\
 250 Before analyzing your trajectory it is often informative to look at
 251 your trajectory first. {\gromacs} comes with a simple trajectory
 252 viewer {\tt \normindex{gmx view}}; the advantage with this one is that it does not
 253 require OpenGL, which usually isn't present on {\eg} supercomputers.
 254 It is also possible to generate a
 255 hard-copy in Encapsulated Postscript format (see
 256 \figref{ngmxdump}). If you want a faster and more fancy viewer
 257  there are several programs
 258 that can read the {\gromacs} trajectory formats -- have a look at our
 259 homepage ({\wwwpage}) for updated links.
 260
 261 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% General properties
 262
 263 \section{General properties}
 264 \label{sec:genprop}
 265 {\tt gmx energy, gmx traj}\\
 266 To analyze some or all {\em energies} and other properties, such as
 267 {\em total pressure}, {\em pressure tensor}, {\em density}, {\em
 268 box-volume} and {\em box-sizes}, use the program {\tt \normindex{gmx energy}}.  A
 269 choice can be made from a list a set of energies, like potential,
 270 kinetic or total energy, or individual contributions, like
 271 Lennard-Jones or dihedral energies.
 272
 273 The {\em center-of-mass velocity}, defined as
 274 \beq
 275 {\bf v}_{com} = {1 \over M} \sum_{i=1}^N m_i {\bf v}_i
 276 \eeq
 277 with $M = \sum_{i=1}^N m_i$ the total mass of the system, can be
 278 monitored in time by the program {\tt \normindex{gmx traj} -com -ov}. It is however
 279 recommended to remove the center-of-mass velocity every step (see
 280 \chref{algorithms})!
 281
 282 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% Radial distribution functions
 283
 284 \section{Radial distribution functions}
 285 \label{sec:rdf}
 286 {\tt gmx rdf}\\
 287 The {\em radial distribution function} (RDF) or pair correlation
 288 function $g_{AB}(r)$ between particles of type $A$ and $B$ is defined
 289 in the following way:
 290 \newcommand{\dfrac}[2]{\displaystyle \frac{#1}{#2}}
 291 \beq
 292 \begin{array}{rcl}
 293 g_{AB}(r)&=&    \dfrac{\langle \rho_B(r) \rangle}{\langle\rho_B\rangle_{local}}         \\
 294          &=&    \dfrac{1}{\langle\rho_B\rangle_{local}}\dfrac{1}{N_A}
 295                 \sum_{i \in A}^{N_A} \sum_{j \in B}^{N_B}
 296                 \dfrac{\delta( r_{ij} - r )}{4 \pi r^2}         \\
 297 \end{array}
 298 \eeq
 299 with $\langle\rho_B(r)\rangle$ the particle density of type $B$ at a distance $r$
 300 around particles $A$, and $\langle\rho_B\rangle_{local}$ the particle density of
 301 type $B$ averaged over all spheres around particles $A$ with radius
 302 $r_{max}$ (see \figref{rdfex}C).
 303
 304 \begin{figure}
 305 \centerline{
 306 {\includegraphics[width=7cm,angle=270]{plots/rdf}}}
 307 \caption[Definition of slices in {\tt gmx rdf}.]{Definition of slices
 308 in {\tt gmx rdf}: A. $g_{AB}(r)$. B. $g_{AB}(r,\theta)$. The slices are
 309 colored gray. C. Normalization $\langle\rho_B\rangle_{local}$. D. Normalization
 310 $\langle\rho_B\rangle_{local,\:\theta }$. Normalization volumes are colored gray.}
 311 \label{fig:rdfex}
 312 \end{figure}
 313
 314 Usually the value of $r_{max}$ is half of the box length.  The
 315 averaging is also performed in time.  In practice the analysis program
 316 {\tt \normindex{gmx rdf}} divides the system into spherical slices (from $r$ to
 317 $r+dr$, see \figref{rdfex}A) and makes a histogram in stead of
 318 the $\delta$-function. An example of the RDF of oxygen-oxygen in
 319 SPC water~\cite{Berendsen81} is given in \figref{rdf}.
 320
 321 \begin{figure}
 322 \centerline{
 323 {\includegraphics[width=8cm]{plots/rdfO-O}}}
 324 \caption{$g_{OO}(r)$ for Oxygen-Oxygen of SPC-water.}
 325 \label{fig:rdf}
 326 \end{figure}
 327
 328 % TODO: This functionality isn't there...
 329 With {\tt gmx rdf} it is also possible to calculate an angle dependent rdf
 330 $g_{AB}(r,\theta)$, where the angle $\theta$ is defined with respect to a
 331 certain laboratory axis ${\bf e}$, see \figref{rdfex}B.
 332 \bea
 333 g_{AB}(r,\theta) &=& {1 \over \langle\rho_B\rangle_{local,\:\theta }} {1 \over N_A} \sum_{i \in A}^{N_A} \sum_{j \in B}^{N_B} {\delta( r_{ij} - r ) \delta(\theta_{ij} -\theta) \over 2 \pi r^2 sin(\theta)}\\
 334 cos(\theta_{ij}) &=& {{\bf r}_{ij} \cdot {\bf e} \over \|r_{ij}\| \;\| e\| }
 335 \eea
 336 This $g_{AB}(r,\theta)$ is useful for analyzing anisotropic systems.
 337 {\bf Note} that in this case the normalization $\langle\rho_B\rangle_{local,\:\theta}$ is
 338 the average density in all angle slices from $\theta$ to $\theta + d\theta$
 339 up to $r_{max}$, so angle dependent, see \figref{rdfex}D.
 340
 341 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% Correlation functions
 342 %\ifthenelse{\equal{\gmxlite}{1}}{}{
 343
 344 \section{Correlation functions}
 345 \label{sec:corr}
 346
 347 \subsection{Theory of correlation functions}
 348 The theory of correlation functions is well established~\cite{Allen87}.
 349 We describe here the implementation of the various
 350 \normindex{correlation} function flavors in the {\gromacs} code.
 351 The definition of the \index{autocorrelation function} (ACF)
 352 $C_f(t)$ for a property $f(t)$ is:
 353 \beq
 354 C_f(t)  ~=~     \left\langle f(\xi) f(\xi+t)\right\rangle_{\xi}
 355 \label{eqn:corr}
 356 \eeq
 357 where the notation on the right hand side indicates averaging over $\xi$, {\ie} over
 358 time origins.
 359 It is also possible to compute cross-correlation function from two properties
 360 $f(t)$ and $g(t)$:
 361 \beq
 362 C_{fg}(t) ~=~   \left\langle f(\xi) g(\xi+t)\right\rangle_{\xi}
 363 \eeq
 364 however, in {\gromacs} there is no standard mechanism to do this
 365 ({\bf note:} you can use the {\tt \normindex{xmgr}} program to compute cross correlations).
 366 The integral of the correlation function over time is the
 367 correlation time $\tau_f$:
 368 \beq
 369 \tau_f  ~=~     \int_0^{\infty} C_f(t) {\rm d} t
 370 \label{eqn:corrtime}
 371 \eeq
 372
 373 In practice, correlation functions are calculated based on data points with
 374 discrete time intervals {$\Delta$t}, so that the ACF from an MD simulation is:
 375 \beq
 376 C_f(j\Delta t)  ~=~     \frac{1}{N-j}\sum_{i=0}^{N-1-j} f(i\Delta t) f((i+j)\Delta t)
 377 \label{eqn:corrmd}
 378 \eeq
 379 where $N$ is the number of available time frames for the calculation.
 380 The resulting ACF is
 381 obviously only available at time points with the same interval {$\Delta$t}.
 382 Since, for many applications, it is necessary to know the short time behavior
 383 of the ACF ({\eg} the first 10 ps) this often means that we have to save the
 384 data with intervals much shorter than the time scale of interest.
 385 Another implication of \eqnref{corrmd} is that in principle we can not compute
 386 all points of the ACF with the same accuracy, since we have $N-1$ data points
 387 for $C_f(\Delta t)$ but only 1 for $C_f((N-1)\Delta t)$. However, if we decide to
 388 compute only an ACF of length $M\Delta t$, where $M \leq N/2$ we can compute
 389 all points with the same statistical accuracy:
 390 \beq
 391 C_f(j\Delta t)  ~=~ \frac{1}{M}\sum_{i=0}^{N-1-M} f(i\Delta t)f((i+j)\Delta t)
 392 \eeq
 393 Here of course $j < M$.
 394 $M$ is sometimes referred to as the \normindex{time lag} of the correlation function.
 395 When we decide to do this, we intentionally do not use all the available points
 396 for very short time intervals ($j << M$), but it makes it easier to interpret
 397 the results.
 398 Another aspect that may not be neglected when computing
 399 ACFs from simulation is that usually the time origins $\xi$ (\eqnref{corr})
 400 are not statistically independent, which may introduce a bias in the results.
 401 This can be tested using a block-averaging procedure, where only time origins
 402 with a spacing at least the length of the time lag are included, {\eg} using
 403 $k$ time origins with spacing of $M\Delta t$ (where $kM \leq N$):
 404 \beq
 405 C_f(j\Delta t)  ~=~ \frac{1}{k}\sum_{i=0}^{k-1} f(iM\Delta t)f((iM+j)\Delta t)
 406 \eeq
 407 However, one
 408 needs very long simulations to get good accuracy this way, because there are
 409 many fewer points that contribute to the ACF.
 410
 411 \subsection{Using FFT for computation of the ACF}
 412 The computational cost for calculating an ACF according to \eqnref{corrmd}
 413 is proportional to $N^2$, which is considerable. However, this can be improved
 414 by using fast Fourier transforms to do the convolution~\cite{Allen87}.
 415
 416 \subsection{Special forms of the ACF}
 417 There are some important varieties on the ACF, {\eg} the ACF of a vector \ve{p}:
 418 \beq
 419 C_{\ve{p}}(t) ~=~       \int_0^{\infty} P_n(\cos\angle\left(\ve{p}(\xi),\ve{p}(\xi+t)\right) {\rm d} \xi
 420 \label{eqn:corrleg}
 421 \eeq
 422 where $P_n(x)$ is the $n^{th}$ order Legendre polynomial
 423 \footnote{$P_0(x) = 1$, $P_1(x) = x$, $P_2(x) = (3x^2-1)/2$}.
 424 Such correlation times
 425 can actually be obtained experimentally using {\eg} NMR or other relaxation
 426 experiments. {\gromacs} can compute correlations using
 427 the 1$^{st}$ and 2$^{nd}$ order Legendre polynomial (\eqnref{corrleg}).
 428 This can also be used for rotational autocorrelation
 429 ({\tt \normindex{gmx rotacf}})
 430 and dipole autocorrelation ({\tt \normindex{gmx dipoles}}).
 431
 432 In order to study torsion angle dynamics, we define a dihedral
 433 autocorrelation function as~\cite{Spoel97a}:
 434 \beq
 435 C(t)    ~=~     \left\langle \cos(\theta(\tau)-\theta(\tau+t))\right\rangle_{\tau}
 436 \label{eqn:coenk}
 437 \eeq
 438 {\bf Note} that this is not a  product of two functions
 439 as is generally used for correlation
 440 functions, but it may be rewritten as the sum of two products:
 441 \beq
 442 C(t)    ~=~     \left\langle\cos(\theta(\tau))\cos(\theta(\tau+t))\,+\,\sin(\theta(\tau))\sin(\theta(\tau+t))\right\rangle_{\tau}
 443 \label{eqn:cot}
 444 \eeq
 445
 446 \subsection{Some Applications}
 447 The program {\tt \normindex{gmx velacc}} calculates the {\em velocity autocorrelation
 448 function}.
 449 \beq
 450 C_{\ve{v}} (\tau) ~=~ \langle {\ve{v}}_i(\tau) \cdot {\ve{v}}_i(0) \rangle_{i \in A}
 451 \eeq
 452 The self diffusion coefficient can be calculated using the Green-Kubo
 453 relation~\cite{Allen87}:
 454 \beq
 455 D_A ~=~ {1\over 3} \int_0^{\infty} \langle {\bf v}_i(t) \cdot {\bf v}_i(0) \rangle_{i \in A} \; dt
 456 \eeq
 457 which is just the integral of the velocity autocorrelation function.
 458 There is a widely-held belief that the velocity ACF converges faster than the mean
 459 square displacement (\secref{msd}), which can also be used for the computation of
 460 diffusion constants. However, Allen \& Tildesley~\cite{Allen87}
 461 warn us that the long-time
 462 contribution to the velocity ACF can not be ignored, so care must be taken.
 463
 464 Another important quantity is the dipole correlation time. The {\em dipole
 465 correlation function} for particles of type $A$ is calculated as follows by
 466 {\tt \normindex{gmx dipoles}}:
 467 \beq
 468 C_{\mu} (\tau) ~=~
 469 \langle {\bf \mu}_i(\tau) \cdot {\bf \mu}_i(0) \rangle_{i \in A}
 470 \eeq
 471 with ${\bf \mu}_i = \sum_{j \in i} {\bf r}_j q_j$. The dipole correlation time
 472 can be computed using \eqnref{corrtime}.
 473 For some applications see~\cite{Spoel98a}.
 474
 475 The \normindex{viscosity} of a liquid can be related to the correlation
 476 time of the Pressure tensor $\ve{P}$~\cite{PSmith93c,Balasubramanian96}.
 477 {\tt \normindex{gmx energy}} can compute the viscosity,
 478 but this is not very accurate~\cite{Hess2002a}, and
 479 actually the values do not converge.
 480 %} % Brace matches ifthenelse test for gmxlite
 481
 482 \section{Mean Square Displacement}
 483 \label{sec:msd}
 484 {\tt gmx msd}\\
 485 To determine the self \index{diffusion coefficient} $D_A$ of
 486 particles of type $A$, one can use the \normindex{Einstein
 487 relation}~\cite{Allen87}:
 488 \beq
 489 \lim_{t \rightarrow \infty} \langle
 490 \|{\bf r}_i(t) - {\bf r}_i(0)\|^2 \rangle_{i \in A} ~=~ 6 D_A t
 491 \eeq
 492 This {\em mean square displacement} and $D_A$ are calculated by the
 493 program {\tt \normindex{gmx msd}}. Normally an index file containing
 494 atom numbers is used and the MSD is averaged over these atoms.  For
 495 molecules consisting of more than one atom, ${\bf r}_i$ can be taken
 496 as the center of mass positions of the molecules. In that case, you
 497 should use an index file with molecule numbers. The results will be
 498 nearly identical to averaging over atoms, however. The {\tt gmx msd}
 499 program can
 500 also be used for calculating diffusion in one or two dimensions. This
 501 is useful for studying lateral diffusion on interfaces.
 502
 503 An example of the mean square displacement of SPC water is given in
 504 \figref{msdwater}.
 505
 506 \begin{figure}
 507 \centerline{
 508 {\includegraphics[width=8cm]{plots/msdwater}}}
 509 \caption{Mean Square Displacement of SPC-water.}
 510 \label{fig:msdwater}
 511 \end{figure}
 512
 513 %\ifthenelse{\equal{\gmxlite}{1}}{}{
 514 %
 515 %%%%%%%%%%%%%%%%%%%%% Bonds, angles and dihedral %%%%%%%%%%%%%%%%%%%
 516 %
 517 \section{Bonds/distances, angles and dihedrals}
 518 \label{sec:bad}
 519 {\tt gmx distance, gmx angle, gmx gangle}\\
 520 To monitor specific {\em bonds} in your modules, or more generally
 521 distances between points, the program
 522 {\tt \normindex{gmx distance}} can calculate distances as a function of
 523 time, as well as the distribution of the distance.
 524 With a traditional index file, the groups should consist of pairs of
 525 atom numbers, for example:
 526 \begin{verbatim}
 527 [ bonds_1 ]
 528  1     2
 529  3     4
 530  9    10
 531
 532 [ bonds_2 ]
 533 12    13
 534 \end{verbatim}
 535 Selections are also supported, with first two positions defining the
 536 first distance, second pair of positions defining the second distance
 537 and so on.  You can calculate the distances between CA and CB atoms in
 538 all your residues (assuming that every residue either has both atoms, or
 539 neither) using a selection such as:
 540 \begin{verbatim}
 541 name CA CB
 542 \end{verbatim}
 543 The selections also allow more generic distances to be computed.
 544 For example, to compute the distances between centers of mass of two
 545 residues, you can use:
 546 \begin{verbatim}
 547 com of resname AAA plus com of resname BBB
 548 \end{verbatim}
 549
 550 The program {\tt \normindex{gmx angle}} calculates the distribution of {\em angles} and
 551 {\em dihedrals} in time. It also gives the average angle or dihedral.
 552 The index file consists of triplets or quadruples of atom numbers:
 553
 554 \begin{verbatim}
 555 [ angles ]
 556  1     2     3
 557  2     3     4
 558  3     4     5
 559
 560 [ dihedrals ]
 561  1     2     3     4
 562  2     3     5     5
 563 \end{verbatim}
 564
 565 For the dihedral angles you can use either the ``biochemical convention''
 566 ($\phi = 0 \equiv cis$) or ``polymer convention'' ($\phi = 0 \equiv trans$),
 567 see \figref{dih_def}.
 568
 569 \begin{figure}
 570 \centerline{
 571 {\includegraphics[width=3.5cm,angle=270]{plots/dih-def}}}
 572 \caption[Dihedral conventions.]{Dihedral conventions: A. ``Biochemical
 573 convention''. B. ``Polymer convention''.}
 574 \label{fig:dih_def}
 575 \end{figure}
 576
 577 The program {\tt \normindex{gmx gangle}} provides a selection-enabled
 578 version to compute angles.  This tool can also compute angles and
 579 dihedrals, but does not support all the options of {\tt gmx angle}, such
 580 as autocorrelation or other time series analyses.
 581 In addition, it supports angles between two vectors, a vector and a
 582 plane, two planes (defined by 2 or 3 points, respectively), a
 583 vector/plane and the $z$ axis, or a vector/plane and the normal of a
 584 sphere (determined by a single position).
 585 Also the angle between a vector/plane compared to its position in the
 586 first frame is supported.
 587 For planes, {\tt \normindex{gmx gangle}} uses the normal vector
 588 perpendicular to the plane.
 589 See \figref{sgangle}A, B, C) for the definitions.
 590
 591 \begin{figure}
 592 \centerline{
 593 {\includegraphics{plots/sgangle}}}
 594 \caption[Angle options of {\tt gmx gangle}.]{Angle options of {\tt gmx gangle}:
 595 A. Angle between two vectors.
 596 B. Angle between two planes.
 597 C. Angle between a vector and the $z$ axis.
 598 D. Angle between a vector and the normal of a sphere.
 599 Also other combinations are supported: planes and vectors can be used
 600 interchangeably.}
 601 \label{fig:sgangle}
 602 \end{figure}
 603 %} % Brace matches ifthenelse test for gmxlite
 604
 605 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% Radius of gyration and distances
 606
 607 \section{Radius of gyration and distances}
 608 \label{sec:rg}
 609 {\tt gmx gyrate, gmx distance, gmx mindist, gmx mdmat, gmx pairdist, gmx xpm2ps}\\
 610 To have a rough measure for the compactness of a structure, you can calculate
 611 the {\em radius of gyration} with the program {\tt \normindex{gmx gyrate}} as follows:
 612 \beq
 613 R_g ~=~ \left({\frac{\sum_i \|{\bf r}_i\|^2 m_i}{\sum_i m_i}}\right)^{\half}
 614 \label{eqn:rg}
 615 \eeq
 616 where $m_i$ is the mass of atom $i$ and ${\bf r}_i$ the position of
 617 atom $i$ with respect to the center of mass of the molecule. It is especially
 618 useful to characterize polymer solutions and proteins.
 619
 620 Sometimes it is interesting to plot the {\em distance} between two atoms,
 621 or the {\em minimum} distance between two groups of atoms
 622 ({\eg}: protein side-chains in a salt bridge).
 623 To calculate these distances between certain groups there are several
 624 possibilities:
 625 \begin{description}
 626 \item[$\bullet$]
 627 The {\em distance between the geometrical centers} of two groups can be
 628 calculated with the program
 629 %\ifthenelse{\equal{\gmxlite}{1}}{
 630 {{\tt \normindex{gmx distance}}, as explained in \secref{bad}.}
 631 \item[$\bullet$]
 632 The {\em minimum distance} between two groups of atoms during time
 633 can be calculated with the program {\tt \normindex{gmx mindist}}. It also calculates the
 634 {\em number of contacts} between these groups
 635 within a certain radius $r_{max}$.
 636 \item[$\bullet$]
 637 {\tt \normindex{gmx pairdist}} is a selection-enabled version of
 638 {\tt gmx mindist}.
 639 \item[$\bullet$]
 640 To monitor the {\em minimum distances between amino acid residues}
 641 within a (protein) molecule, you can use
 642 the program {\tt \normindex{gmx mdmat}}. This minimum distance between two residues
 643 A$_i$ and A$_j$ is defined as the smallest distance between any pair of
 644 atoms (i $\in$ A$_i$, j $\in$ A$_j$).
 645 The output is a symmetrical matrix of smallest distances
 646 between all residues.
 647 To visualize this matrix, you can use a program such as {\tt xv}.
 648 If you want to view the axes and legend or if you want to print
 649 the matrix, you can convert it with
 650 {\tt xpm2ps} into a Postscript picture, see \figref{mdmat}.
 651 \begin{figure}
 652 \centerline{
 653 \includegraphics[width=6.5cm]{plots/distm}}
 654 \caption{A minimum distance matrix for a peptide~\protect\cite{Spoel96b}.}
 655 \label{fig:mdmat}
 656 \end{figure}
 657
 658 Plotting these matrices for different time-frames, one can analyze changes
 659 in the structure, and {\eg} forming of salt bridges.
 660 \end{description}
 661
 662 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% Root mean square deviations
 663
 664 \section{Root mean square deviations in structure}
 665 \label{sec:rmsd}
 666 {\tt gmx rms, gmx rmsdist}\\
 667 The {\em root mean square deviation} ($RMSD$) of certain atoms in a molecule
 668 with respect to a reference structure can be calculated with the program
 669 {\tt \normindex{gmx rms}} by least-square fitting the structure to the reference structure
 670 ($t_2 = 0$) and subsequently calculating the $RMSD$ (\eqnref{rmsd}).
 671 \beq
 672 RMSD(t_1,t_2) ~=~ \left[\frac{1}{M} \sum_{i=1}^N m_i \|{\bf r}_i(t_1)-{\bf r}_i(t_2)\|^2 \right]^{\frac{1}{2}}
 673 \label{eqn:rmsd}
 674 \eeq
 675 where $M = \sum_{i=1}^N m_i$ and ${\bf r}_i(t)$ is the position of atom $i$ at time $t$.
 676 {\bf Note} that fitting does not have to use the same atoms as the calculation
 677 of the $RMSD$; {\eg} a protein is usually fitted on the backbone atoms
 678 (N,C$_{\alpha}$,C), but the $RMSD$ can be computed of the backbone
 679 or of the whole protein.
 680
 681 Instead of comparing the structures to the initial structure at time $t=0$
 682 (so for example a crystal structure), one can also calculate \eqnref{rmsd}
 683 with a structure at time $t_2=t_1-\tau$.
 684 This gives some insight in the mobility as a function of $\tau$.
 685 A matrix can also be made with the $RMSD$ as a function of $t_1$ and $t_2$,
 686 which gives a nice graphical interpretation of a trajectory.
 687 If there are transitions in a trajectory, they will clearly show up in
 688 such a matrix.
 689
 690 Alternatively the $RMSD$ can be computed using a fit-free method with the
 691 program {\tt \normindex{gmx rmsdist}}:
 692 \beq
 693 RMSD(t) ~=~     \left[\frac{1}{N^2}\sum_{i=1}^N \sum_{j=1}^N    \|{\bf r}_{ij}(t)-{\bf r}_{ij}(0)\|^2\right]^{\frac{1}{2}}
 694 \label{eqn:rmsdff}
 695 \eeq
 696 where the {\em distance} {\bf r}$_{ij}$ between atoms at time $t$
 697 is compared with the distance between the same atoms at time $0$.
 698
 699 %\ifthenelse{\equal{\gmxlite}{1}}{}{
 700 \section{Covariance analysis\index{covariance analysis}}
 701 \label{sec:covanal}
 702 Covariance analysis, also called
 703 \seeindex{principal component analysis}{covariance analysis}
 704 or \seeindex{essential dynamics}{covariance analysis}
 705 \cite{Amadei93}{,} can find correlated motions.
 706 It uses the covariance matrix $C$ of the atomic coordinates:
 707 \beq
 708 C_{ij} = \left \langle
 709 M_{ii}^{\frac{1}{2}} (x_i - \langle x_i \rangle)
 710 M_{jj}^{\frac{1}{2}}  (x_j - \langle x_j \rangle)
 711 \right \rangle
 712 \eeq
 713 where $M$ is a diagonal matrix containing the masses of the atoms
 714 (mass-weighted analysis) or the unit matrix (non-mass weighted analysis).
 715 $C$ is a symmetric $3N \times 3N$ matrix, which can be diagonalized with
 716 an orthonormal transformation matrix $R$:
 717 \beq
 718 R^T C R = \mbox{diag}(\lambda_1,\lambda_2,\ldots,\lambda_{3N})
 719 ~~~~\mbox{where}~~\lambda_1 \geq \lambda_2 \geq \ldots \geq \lambda_{3N}
 720 \eeq
 721 The columns of $R$ are the eigenvectors, also called principal or
 722 essential modes.
 723 $R$ defines a transformation to a new coordinate system. The trajectory
 724 can be projected on the principal modes to give the principal components
 725 $p_i(t)$:
 726 \beq
 727 {\bf p}(t) = R^T M^{\frac{1}{2}} ({\bf x}(t) - \langle {\bf x} \rangle)
 728 \eeq
 729 The eigenvalue $\lambda_i$ is the mean square fluctuation of principal
 730 component $i$. The first few principal modes often describe
 731 collective, global motions in the system.
 732 The trajectory can be filtered along one (or more) principal modes.
 733 For one principal mode $i$ this goes as follows:
 734 \beq
 735 {\bf x}^f(t) =
 736 \langle {\bf x} \rangle + M^{-\frac{1}{2}} R_{*i} \, p_i(t)
 737 \eeq
 738
 739 When the analysis is performed on a macromolecule, one often wants to
 740 remove the overall rotation and translation to look at the internal motion
 741 only. This can be achieved by least square fitting to a reference structure.
 742 Care has to be taken that the reference structure is representative for the
 743 ensemble, since the choice of reference structure influences the covariance
 744 matrix.
 745
 746 One should always check if the principal modes are well defined.
 747 If the first principal component resembles a half cosine and
 748 the second resembles a full cosine, you might be filtering noise (see below).
 749 A good way to check the relevance of the first few principal
 750 modes is to calculate the overlap of the sampling between
 751 the first and second half of the simulation.
 752 {\bf Note} that this can only be done when the same reference structure is
 753 used for the two halves.
 754
 755 A good measure for the overlap has been defined in~\cite{Hess2002b}.
 756 The elements of the covariance matrix are proportional to the square
 757 of the displacement, so we need to take the square root of the matrix
 758 to examine the extent of sampling. The square root can be
 759 calculated from the eigenvalues $\lambda_i$ and the eigenvectors,
 760 which are the columns of the rotation matrix $R$.
 761 For a symmetric and diagonally-dominant matrix $A$ of size $3N \times 3N$
 762 the square root can be calculated as:
 763 \beq
 764 A^\frac{1}{2} =
 765 R \, \mbox{diag}(\lambda_1^\frac{1}{2},\lambda_2^\frac{1}{2},\ldots,\lambda_{3N}^\frac{1}{2}) \, R^T
 766 \eeq
 767 It can be verified easily that the product of this matrix with itself gives
 768 $A$.
 769 Now we can define a difference $d$ between covariance matrices $A$ and $B$
 770 as follows:
 771 \begin{eqnarray}
 772 d(A,B) & = & \sqrt{\mbox{tr}\left(\left(A^\frac{1}{2} - B^\frac{1}{2}\right)^2\right)
 773 }
 774 \\ & = &
 775 \sqrt{\mbox{tr}\left(A + B - 2 A^\frac{1}{2} B^\frac{1}{2}\right)}
 776 \\ & = &
 777 \left( \sum_{i=1}^N \left( \lambda_i^A + \lambda_i^B \right)
 778 - 2 \sum_{i=1}^N \sum_{j=1}^N \sqrt{\lambda_i^A \lambda_j^B}
 779 \left(R_i^A \cdot R_j^B\right)^2 \right)^\frac{1}{2}
 780 \end{eqnarray}
 781 where tr is the trace of a matrix.
 782 We can now define the overlap $s$ as:
 783 \beq
 784 s(A,B) = 1 - \frac{d(A,B)}{\sqrt{\mbox{tr}A + \mbox{tr} B}}
 785 \eeq
 786 The overlap is 1 if and only if matrices $A$ and $B$ are identical.
 787 It is 0 when the sampled subspaces are completely orthogonal.
 788
 789 A commonly-used measure is the subspace overlap of the first few
 790 eigenvectors of covariance matrices.
 791 The overlap of the subspace spanned by $m$ orthonormal vectors
 792 ${\bf w}_1,\ldots,{\bf w}_m$ with a reference subspace spanned by
 793 $n$ orthonormal vectors ${\bf v}_1,\ldots,{\bf v}_n$
 794 can be quantified as follows:
 795 \beq
 796 \mbox{overlap}({\bf v},{\bf w}) =
 797 \frac{1}{n} \sum_{i=1}^n \sum_{j=1}^m ({\bf v}_i \cdot {\bf w}_j)^2
 798 \eeq
 799 The overlap will increase with increasing $m$ and will be 1 when
 800 set ${\bf v}$ is a subspace of set ${\bf w}$.
 801 The disadvantage of this method is that it does not take the eigenvalues
 802 into account. All eigenvectors are weighted equally, and when
 803 degenerate subspaces are present (equal eigenvalues), the calculated overlap
 804 will be too low.
 805
 806 Another useful check is the cosine content. It has been proven that the
 807 the principal components of random diffusion are cosines with the number of
 808 periods equal to half the principal component index~\cite{Hess2000,Hess2002b}.
 809 The eigenvalues are proportional to the index to the power $-2$.
 810 The cosine content is defined as:
 811 \beq
 812 \frac{2}{T}
 813 \left( \int_0^T \cos\left(\frac{i \pi t}{T}\right) \, p_i(t) \mbox{d} t \right)^2
 814 \left( \int_0^T p_i^2(t) \mbox{d} t \right)^{-1}
 815 \eeq
 816 When the cosine content of the first few principal components
 817 is close to 1, the largest fluctuations are not connected with
 818 the potential, but with random diffusion.
 819
 820 The covariance matrix is built and diagonalized by
 821 {\tt \normindex{gmx covar}}.
 822 The principal components and overlap (and many more things)
 823 can be plotted and analyzed with {\tt \normindex{gmx anaeig}}.
 824 The cosine content can be calculated with {\tt \normindex{gmx analyze}}.
 825 %} % Brace matches ifthenelse test for gmxlite
 826
 827 %\ifthenelse{\equal{\gmxlite}{1}}{}{
 828 \section{Dihedral principal component analysis}
 829 {\tt gmx angle, gmx covar, gmx anaeig}\\
 830 Principal component analysis can be performed in dihedral
 831 space~\cite{Mu2005a} using {\gromacs}. You start by defining the
 832 dihedral angles of interest in an index file, either using {\tt
 833  gmx mk_angndx} or otherwise. Then you use the {\tt gmx angle} program
 834 with the {\tt -or} flag to produce a new {\tt .trr} file containing the cosine and
 835 sine of each dihedral angle in two coordinates, respectively. That is,
 836 in the {\tt .trr} file you will have a series of numbers corresponding to:
 837 cos($\phi_1$), sin($\phi_1$), cos($\phi_2$), sin($\phi_2$), ...,
 838 cos($\phi_n$), sin($\phi_n$), and the array is padded with zeros, if
 839 necessary.  Then you can use this {\tt .trr} file as input for the {\tt
 840  gmx covar} program and perform principal component analysis as usual.
 841 For this to work you will need to generate a reference file ({\tt .tpr},
 842 {\tt .gro}, {\tt .pdb} etc.) containing the same number of ``atoms''
 843 as the new {\tt .trr} file, that is for $n$ dihedrals you need 2$n$/3 atoms
 844 (rounded up if not an integer number).
 845 You should use the {\tt -nofit} option for {\tt
 846 gmx covar} since the coordinates in the dummy reference file do not
 847 correspond in any way to the information in the {\tt .trr} file. Analysis of
 848 the results is done using {\tt gmx anaeig}.
 849 %} % Brace matches ifthenelse test for gmxlite
 850
 851 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% Hydrogen bonds
 852
 853 \section{Hydrogen bonds}
 854 {\tt gmx hbond}\\
 855 The program {\tt \normindex{gmx hbond}} analyzes the {\em hydrogen bonds} (H-bonds)
 856 between all possible donors D and acceptors A. To determine if an
 857 H-bond exists, a geometrical criterion is used, see also
 858 \figref{hbond}:
 859 \beq
 860 \begin{array}{rclcl}
 861 r       & \leq  & r_{HB}        & = & 0.35~\mbox{nm}    \\
 862 \alpha  & \leq  & \alpha_{HB}   & = & 30^o              \\
 863 \end{array}
 864 \eeq
 865
 866 \begin{figure}
 867 \centerline{\includegraphics[width=2.5cm,angle=270]{plots/hbond}}
 868 \caption{Geometrical Hydrogen bond criterion.}
 869 \label{fig:hbond}
 870 \end{figure}
 871
 872 The value of $r_{HB} = 0.35$~nm corresponds to the first minimum of the RDF of
 873 SPC water (see also \figref{rdf}).
 874
 875 The program {\tt gmx hbond} analyzes all hydrogen bonds existing
 876 between two groups of atoms (which must be either identical or
 877 non-overlapping) or in specified donor-hydrogen-acceptor triplets, in
 878 the following ways:
 879
 880 \begin{figure}
 881 \centerline{
 882 {\includegraphics[width=5cm,angle=270]{plots/hbond-insert}}}
 883 \caption[Insertion of water into an H-bond.]{Insertion of water into
 884 an H-bond. (1) Normal H-bond between two residues. (2) H-bonding
 885 bridge via a water molecule.}
 886 \label{fig:insert}
 887 \end{figure}
 888
 889 \begin{itemize}
 890 \item
 891 Donor-Acceptor distance ($r$) distribution of all H-bonds
 892 \item
 893 Hydrogen-Donor-Acceptor angle ($\alpha$) distribution of all H-bonds
 894 \item
 895 The total number of H-bonds in each time frame
 896 \item
 897 \newcommand{\nn}[1]{$n$-$n$+$#1$}
 898 The number of H-bonds in time between residues, divided into groups
 899 \nn{i} where $n$ and $n$+$i$ stand for residue numbers and $i$ goes
 900 from 0 to 6. The group for $i=6$ also includes all H-bonds for
 901 $i>6$. These groups include the \nn{3}, \nn{4} and \nn{5} H-bonds,
 902 which provide a measure for the formation of $\alpha$-helices or
 903 $\beta$-turns or strands.
 904 \item
 905 The lifetime of the H-bonds is calculated from the average over all
 906 autocorrelation functions of the existence functions (either 0 or 1)
 907 of all H-bonds:
 908 \beq
 909 C(\tau) ~=~ \langle s_i(t)~s_i (t + \tau) \rangle
 910 \label{eqn:hbcorr}
 911 \eeq
 912 with $s_i(t) = \{0,1\}$ for H-bond $i$ at time $t$. The integral of
 913 $C(\tau)$ gives a rough estimate of the average H-bond lifetime
 914 $\tau_{HB}$:
 915 \beq
 916 \tau_{HB} ~=~ \int_{0}^{\infty} C(\tau) d\tau
 917 \label{eqn:hblife}
 918 \eeq
 919 Both the integral and the complete autocorrelation function $C(\tau)$
 920 will be output, so that more sophisticated analysis ({\eg}\@ using
 921 multi-exponential fits) can be used to get better estimates for
 922 $\tau_{HB}$. A more complete analysis is given in ref.~\cite{Spoel2006b};
 923 one of the more fancy option is the Luzar and Chandler analysis
 924 of hydrogen bond kinetics~\cite{Luzar96b,Luzar2000a}.
 925 \item
 926 An H-bond existence map can be generated of dimensions {\em
 927 \#~H-bonds}$\times${\em \#~frames}. The ordering is identical to the index
 928 file (see below), but reversed, meaning that the last triplet in the index
 929 file corresponds to the first row of the existence map.
 930 \item
 931 Index groups are output containing the analyzed groups, all
 932 donor-hydrogen atom pairs and acceptor atoms in these groups,
 933 donor-hydrogen-acceptor triplets involved in hydrogen bonds between
 934 the analyzed groups and all solvent atoms involved in insertion.
 935
 936 \end{itemize}
 937
 938 %\ifthenelse{\equal{\gmxlite}{1}}{}{
 939 %
 940 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% Protein related items
 941
 942 \section{Protein-related items}
 943 {\tt gmx do_dssp, gmx rama, gmx wheel}\\
 944 To analyze structural changes of a protein, you can calculate the radius of
 945 gyration or the minimum residue distances over time
 946 (see \secref{rg}), or calculate the RMSD (\secref{rmsd}).
 947
 948 You can also look at the changing of {\em secondary structure elements}
 949 during your run. For this, you can use the program {\tt \normindex{gmx do_dssp}}, which is
 950 an interface for the commercial program {\tt DSSP}~\cite{Kabsch83}. For
 951 further information, see the {\tt DSSP} manual. A typical output plot of
 952 {\tt gmx do_dssp} is given in \figref{dssp}.
 953
 954 \begin{figure}
 955 \centerline{
 956 \includegraphics[width=12cm]{plots/dssp}}
 957 \caption{Analysis of the secondary structure elements of a peptide in time.}
 958 \label{fig:dssp}
 959 \end{figure}
 960
 961 One other important analysis of proteins is the so-called
 962 {\em Ramachandran plot}.
 963 This is the projection of the structure on the two dihedral angles $\phi$ and
 964 $\psi$ of the protein backbone, see \figref{phipsi}.
 965
 966 \begin{figure}
 967 \centerline{
 968 \includegraphics[width=5cm]{plots/phipsi}}
 969 \caption{Definition of the dihedral angles $\phi$ and $\psi$ of the protein backbone.}
 970 \label{fig:phipsi}
 971 \end{figure}
 972
 973 To evaluate this Ramachandran plot you can use the program {\tt
 974 \normindex{gmx rama}}.
 975 A typical output is given in \figref{rama}.
 976
 977 \begin{figure}
 978 \centerline{
 979 {\includegraphics[width=8cm]{plots/rama}}}
 980 \caption{Ramachandran plot of a small protein.}
 981 \label{fig:rama}
 982 \end{figure}
 983
 984 When studying $\alpha$-helices
 985 it is useful to have a {\em helical wheel} projection
 986 of your peptide, to see whether a peptide is amphipathic. This can be done
 987 using the {\tt \normindex{gmx wheel}} program. Two examples are
 988 plotted in \figref{wheel}.
 989
 990 \begin{figure}
 991 \centerline{\includegraphics[width=\htw]{plots/hpr-wheel}}
 992 \caption{Helical wheel projection of the N-terminal helix of HPr.}
 993 \label{fig:wheel}
 994 \end{figure}
 995
 996 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% Membrane Related Items
 997
 998 \section{Interface-related items}
 999 {\tt gmx order, gmx density, gmx potential, gmx traj}\\
1000 When simulating molecules with long carbon tails, it can be
1001 interesting to calculate their average orientation. There are several
1002 flavors of order parameters, most of which are related. The program
1003 {\tt \normindex{gmx order}} can calculate order parameters using the equation:
1004
1005 \begin{equation}
1006 S_{z} = \frac{3}{2}\langle {\cos^2{\theta_z}} \rangle - \frac{1}{2}
1007 \label{eqn:Sgr}
1008 \end{equation}
1009 where $\theta_z$ is the angle between the $z$-axis of the simulation
1010 box and the molecular axis under consideration. The latter is defined as the
1011 vector from C$_{n-1}$ to C$_{n+1}$. The parameters $S_x$
1012 and $S_y$ are defined in the same way. The brackets imply averaging over time
1013 and molecules. Order parameters can vary between 1 (full order along
1014 the interface normal) and $-1/2$ (full order perpendicular to the
1015 normal), with a value of zero in the case of isotropic orientation.
1016
1017 The program can do two things for you. It can calculate the order
1018 parameter for each CH$_2$ segment separately, for any of three axes,
1019 or it can divide the box in slices and calculate the average value of
1020 the order parameter per segment in one slice. The first method gives
1021 an idea of the ordering of a molecule from head to tail, the second
1022 method gives an idea of the ordering as function of the box length.
1023
1024 The electrostatic potential ($\psi$) across the interface can be
1025 computed from a trajectory by evaluating the double integral of the
1026 charge density ($\rho(z)$):
1027 \beq
1028 \psi(z) - \psi(-\infty) = - \int_{-\infty}^z dz' \int_{-\infty}^{z'} \rho(z'')dz''/ \epsilon_0
1029 \label{eqn:elpotgr}
1030 \eeq
1031 where the position $z=-\infty$ is far enough in the bulk phase such that
1032 the field is zero.  With this method, it is possible to ``split'' the
1033 total potential into separate contributions from lipid and water
1034 molecules. The program {\tt \normindex{gmx potential}} divides the box in slices and
1035 sums all charges of the atoms in each slice. It then integrates this
1036 charge density to give the electric field, which is in turn integrated to
1037 give the potential. Charge density, electric field, and potential are written
1038 to {\tt xvgr} input files.
1039
1040 The program {\tt \normindex{gmx traj}} is a very simple analysis program. All it
1041 does is print the coordinates, velocities, or forces of selected atoms.
1042 It can also calculate the center of mass of one or more
1043 molecules and print the coordinates of the center of mass to three
1044 files. By itself, this is probably not a very useful analysis, but
1045 having the coordinates of selected molecules or atoms can be very
1046 handy for further analysis, not only in interfacial systems.
1047
1048 The program {\tt \normindex{gmx density}} calculates the mass density of
1049 groups and gives a plot of the density against a box
1050 axis. This is useful for looking at the distribution of groups or
1051 atoms across the interface.
1052
1053 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% Chemical shifts
1054
1055 % \section{Chemical shifts}
1056 % {\tt total, do_shift}\\
1057 % You can compute the NMR chemical shifts of protons with the program
1058 % {\tt \normindex{do_shift}}. This is just an {\gromacs} interface to
1059 % the public domain program {\tt total}~\cite{Williamson93a}. For
1060 % further information, read the article. Although there is limited
1061 % support for this in {\gromacs}, users are encouraged to use the
1062 % software provided by David Case's group at Scripps because it seems
1063 % to be more up-to-date.
1064
1065 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1066 %} % Brace matches ifthenelse test for gmxlite
1067
1068 % LocalWords:  Xmgr ndx mk angndx rdf dihedrals grompp hydrogens MainChain Prot
1069 % LocalWords:  oxygens SideChain tryptophan vsitetop aminoacids dat ngmx dr SPC
1070 % LocalWords:  OO ACF fg xmgr corrmd ACFs kM iM FFT th NMR nd corrleg rotacf dt
1071 % LocalWords:  velacc Kubo msd Tildesley corrtime msdwater sgangle cis dih xpm
1072 % LocalWords:  mindist mdmat rms rmsdist rmsd jj diag eigenvectors covar anaeig
1073 % LocalWords:  macromolecule trr tpr gro pdb nofit hbond rclcl HB multi Luzar
1074 % LocalWords:  dssp rama xrama rg Ramachandran phipsi amphipathic HPr xvgr pvd
1075 % LocalWords:  Scripps RDF online usinggroups mdrun groupconcept HOH
1076 % LocalWords:  residuetypes determinding OpenGL traj ov rcl ij ps nm
1077 % LocalWords:  autocorrelation interfacial pairdist