docs/reference-manual/details.rst

   1 Some implementation details
   2 ===========================
   3
   4 In this chapter we will present some implementation details. This is far
   5 from complete, but we deemed it necessary to clarify some things that
   6 would otherwise be hard to understand.
   7
   8 Single Sum Virial in |Gromacs|
   9 ------------------------------
  10
  11 The virial :math:`\Xi` can be written in full tensor form as:
  12
  13 .. math:: \Xi~=~-\frac{1}{2}~\sum_{i < j}^N~\mathbf{r}_ij\otimes\mathbf{F}_{ij}
  14
  15 where :math:`\otimes` denotes the *direct product* of two vectors. [1]_
  16 When this is computed in the inner loop of an MD program 9
  17 multiplications and 9 additions are needed. [2]_
  18
  19 Here it is shown how it is possible to extract the virial calculation
  20 from the inner loop \ :ref:`177 <refBekker93b>`.
  21
  22 Virial
  23 ~~~~~~
  24
  25 In a system with periodic boundary conditions, the periodicity must be
  26 taken into account for the virial:
  27
  28 .. math:: \Xi~=~-\frac{1}{2}~\sum_{i < j}^{N}~\mathbf{r}_{ij}^n\otimes\mathbf{F}_{ij}
  29
  30 where :math:`\mathbf{r}_{ij}^n` denotes the distance
  31 vector of the *nearest image* of atom :math:`i` from atom :math:`j`. In
  32 this definition we add a *shift vector* :math:`\delta_i` to the position
  33 vector :math:`\mathbf{r}_i` of atom :math:`i`. The
  34 difference vector :math:`\mathbf{r}_{ij}^n` is thus equal
  35 to:
  36
  37 .. math:: \mathbf{r}_{ij}^n~=~\mathbf{r}_i+\delta_i-\mathbf{r}_j
  38
  39 or in shorthand:
  40
  41 .. math:: \mathbf{r}_{ij}^n~=~\mathbf{r}_i^n-\mathbf{r}_j
  42
  43 In a triclinic system, there are 27 possible images of :math:`i`; when
  44 a truncated octahedron is used, there are 15 possible images.
  45
  46 Virial from non-bonded forces
  47 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  48
  49 Here the derivation for the single sum virial in the *non-bonded force*
  50 routine is given. There are a couple of considerations that are special
  51 to |Gromacs| that we take into account:
  52
  53 -  When calculating short-range interactions, we apply the *minimum
  54    image convention* and only consider the closest image of each
  55    neighbor - and in particular we never allow interactions between a
  56    particle and any of its periodic images. For all the equations below,
  57    this means :math:`i \neq j`.
  58
  59 -  In general, either the :math:`i` or :math:`j` particle might be
  60    shifted to a neighbor cell to get the closest interaction (shift
  61    :math:`\delta_{ij}`). However, with minimum image convention there
  62    can be at most 27 different shifts for particles in the central cell,
  63    and for typical (very short-ranged) biomolecular interactions there
  64    are typically only a few different shifts involved for each particle,
  65    not to mention that each interaction can only be present for one
  66    shift.
  67
  68 -  For the |Gromacs| nonbonded interactions we use this to split the
  69    neighborlist of each :math:`i` particle into multiple separate lists,
  70    where each list has a constant shift :math:`\delta_i` for the
  71    :math:`i` partlcle. We can represent this as a sum over shifts (for
  72    which we use index :math:`s`), with the constraint that each particle
  73    interaction can only contribute to one of the terms in this sum, and
  74    the shift is no longer dependent on the :math:`j` particles. For any
  75    sum that does not contain complex dependence on :math:`s`, this means
  76    the sum trivially reduces to just the sum over :math:`i` and/or
  77    :math:`j`.
  78
  79 -  To simplify some of the sums, we replace sums over :math:`j<i` with
  80    double sums over all particles (remember, :math:`i \neq j`) and
  81    divide by 2.
  82
  83 Starting from the above definition of the virial, we then get
  84
  85 .. math::
  86
  87    \begin{aligned}
  88    \Xi
  89    &~=~&-{\frac{1}{2}}~\sum_{i < j}^{N}~{\mathbf r}^n_{ij} \otimes {\mathbf F}_{ij} \nonumber \\
  90    &~=~&-{\frac{1}{2}}~\sum_{i < j}^{N}~\left( {\mathbf r}_i + \delta_{ij} - {\mathbf r}_j \right) \otimes {\mathbf F}_{ij} \nonumber \\
  91    &~=~&-{\frac{1}{4}}~\sum_{i=1}^{N}~\sum_{j=1}^{N}~\left( {\mathbf r}_i + \delta_{ij} - {\mathbf r}_j \right) \otimes {\mathbf F}_{ij} \nonumber \\
  92    &~=~&-{\frac{1}{4}}~\sum_{i=1}^{N}~\sum_{s}~\sum_{j=1}^{N}~\left( {\mathbf r}_i + \delta_{i,s} - {\mathbf r}_j \right) \otimes {\mathbf F}_{ij,s} \nonumber \\
  93    &~=~&-{\frac{1}{4}}~\sum_{i=}^{N}~\sum_{s}~\sum_{j=1}^{N}~\left( \left( {\mathbf r}_i + \delta_{i,s} \right) \otimes {\mathbf F}_{ij,s} -{\mathbf r}_j \otimes {\mathbf F}_{ij,s} \right) \nonumber \\
  94    &~=~&-{\frac{1}{4}}~\sum_{i=1}^{N}~\sum_{s}~\sum_{j=1}^N ~\left( {\mathbf r}_i + \delta_{i,s} \right) \otimes {\mathbf F}_{ij,s} + {\frac{1}{4}}\sum_{i=1}^{N}~\sum_{s}~\sum_{j=1}^{N} {\mathbf r}_j \otimes {\mathbf F}_{ij,s} \nonumber \\
  95    &~=~&-{\frac{1}{4}}~\sum_{i=1}^{N}~\sum_{s}~\sum_{j=1}^N ~\left( {\mathbf r}_i + \delta_{i,s} \right) \otimes {\mathbf F}_{ij,s} + {\frac{1}{4}}\sum_{i=1}^{N}~\sum_{j=1}^{N} {\mathbf r}_j \otimes {\mathbf F}_{ij} \nonumber \\
  96    &~=~&-{\frac{1}{4}}~\sum_{s}~\sum_{i=1}^{N}~\left( {\mathbf r}_i + \delta_{i,s} \right) \otimes ~\sum_{j=1}^N {\mathbf F}_{ij,s} + {\frac{1}{4}}\sum_{j=1}^N {\mathbf r}_j \otimes \sum_{i=1}^{N} {\mathbf F}_{ij} \nonumber \\
  97    &~=~&-{\frac{1}{4}}~\sum_{s}~\sum_{i=1}^{N}~\left( {\mathbf r}_i + \delta_{i,s} \right) \otimes ~\sum_{j=1}^N {\mathbf F}_{ij,s} - {\frac{1}{4}}\sum_{j=1}^N {\mathbf r}_j \otimes \sum_{i=1}^{N} {\mathbf F}_{ji} \nonumber \\
  98    &~=~&-{\frac{1}{4}}~\sum_{s}~\sum_{i=1}^{N}~\left( {\mathbf r}_i + \delta_{i,s} \right) \otimes {\mathbf F}_{i,s} - {\frac{1}{4}}\sum_{j=1}^N~{\mathbf r}_j \otimes {\mathbf F}_{j}  \nonumber \\
  99    &~=~&-{\frac{1}{4}}~\left(\sum_{i=1}^{N}~{\mathbf r}_i  \otimes {\mathbf F}_{i} + \sum_{j=1}^N~{\mathbf r}_j \otimes {\mathbf F}_{j} \right) - {\frac{1}{4}}\sum_{s}~\sum_{i=1}^{N} \delta_{i,s} \otimes {\mathbf F}_{i,s}  \nonumber \\
 100    &~=~&-{\frac{1}{2}}\sum_{i=1}^{N}~{\mathbf r}_i \otimes {\mathbf F}_{i} -{\frac{1}{4}}\sum_{s}~\sum_{i=1}^{N}~\delta_{i,s} \otimes {\mathbf F}_{i,s} \nonumber \\
 101    &~=~&-{\frac{1}{2}}\sum_{i=1}^{N}~{\mathbf r}_i \otimes {\mathbf F}_{i} -{\frac{1}{4}}\sum_{s}~\delta_{s} \otimes {\mathbf F}_{s} \nonumber \\
 102    &~=~&\Xi_0 + \Xi_1\end{aligned}
 103
 104 In the second-last stage, we have used the property that each shift
 105 vector itself does not depend on the coordinates of particle :math:`i`,
 106 so it is possible to sum up all forces corresponding to each shift
 107 vector (in the nonbonded kernels), and then just use a sum over the
 108 different shift vectors outside the kernels. We have also used
 109
 110 .. math::
 111
 112    \begin{aligned}
 113    \mathbf{F}_i&~=~&\sum_{j=1}^N~\mathbf{F}_{ij}                                        \\
 114    \mathbf{F}_j&~=~&\sum_{i=1}^N~\mathbf{F}_{ji}\end{aligned}
 115
 116 which is the total force on :math:`i` with respect to :math:`j`.
 117 Because we use Newton’s Third Law:
 118
 119 .. math:: \mathbf{F}_{ij}~=~-\mathbf{F}_{ji}
 120
 121 we must, in the implementation, double the term containing the shift
 122 :math:`\delta_i`. Similarly, in a few places we have summed the
 123 shift-dependent force over all shifts to come up with the total force
 124 per interaction or particle.
 125
 126 This separates the total virial :math:`\Xi` into a component
 127 :math:`\Xi_0` that is a single sum over particles, and a second
 128 component :math:`\Xi_1` that describes the influence of the particle
 129 shifts, and that is only a sum over the different shift vectors.
 130
 131 The intra-molecular shift (mol-shift)
 132 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 133
 134 For the bonded forces and SHAKE it is possible to make a *mol-shift*
 135 list, in which the periodicity is stored. We simple have an array mshift
 136 in which for each atom an index in the shiftvec array is stored.
 137
 138 The algorithm to generate such a list can be derived from graph theory,
 139 considering each particle in a molecule as a bead in a graph, the bonds
 140 as edges.
 141
 142 #. Represent the bonds and atoms as bidirectional graph
 143
 144 #. Make all atoms white
 145
 146 #. Make one of the white atoms black (atom :math:`i`) and put it in the
 147    central box
 148
 149 #. Make all of the neighbors of :math:`i` that are currently white, gray
 150
 151 #. Pick one of the gray atoms (atom :math:`j`), give it the correct
 152    periodicity with respect to any of its black neighbors and make it
 153    black
 154
 155 #. Make all of the neighbors of :math:`j` that are currently white, gray
 156
 157 #. If any gray atom remains, go to [5]
 158
 159 #. If any white atom remains, go to [3]
 160
 161 Using this algorithm we can
 162
 163 -  optimize the bonded force calculation as well as SHAKE
 164
 165 -  calculate the virial from the bonded forces in the single sum method
 166    again
 167
 168 Find a representation of the bonds as a bidirectional graph.
 169
 170 Virial from Covalent Bonds
 171 ~~~~~~~~~~~~~~~~~~~~~~~~~~
 172
 173 Since the covalent bond force gives a contribution to the virial, we
 174 have:
 175
 176 .. math::
 177
 178    \begin{aligned}
 179    b    &~=~&   \|\mathbf{r}_{ij}^n\|                                   \\
 180    V_b  &~=~&   \frac{1}{2} k_b(b-b_0)^2                                \\
 181    \mathbf{F}_i &~=~&   -\nabla V_b                                     \\
 182         &~=~&   k_b(b-b_0)\frac{\mathbf{r}_{ij}^n}{b}                   \\
 183    \mathbf{F}_j &~=~&   -\mathbf{F}_i\end{aligned}
 184
 185 The virial contribution from the bonds then is:
 186
 187 .. math::
 188
 189    \begin{aligned}
 190    \Xi_b        &~=~&   -\frac{1}{2}(\mathbf{r}_i^n\otimes\mathbf{F}_i~+~\mathbf{r}_j\otimes\mathbf{F}_j)       \\
 191         &~=~&   -\frac{1}{2}\mathbf{r}_{ij}^n\otimes\mathbf{F}_i\end{aligned}
 192
 193 Virial from SHAKE
 194 ~~~~~~~~~~~~~~~~~
 195
 196 An important contribution to the virial comes from shake. Satisfying the
 197 constraints a force **G** that is exerted on the particles “shaken.” If
 198 this force does not come out of the algorithm (as in standard SHAKE) it
 199 can be calculated afterward (when using *leap-frog*) by:
 200
 201 .. math::
 202
 203    \begin{aligned}
 204    \Delta\mathbf{r}_i&~=~&{\mathbf{r}_i}(t+{\Delta t})-
 205    [\mathbf{r}_i(t)+{\bf v}_i(t-\frac{{\Delta t}}{2}){\Delta t}+\frac{\mathbf{F}_i}{m_i}{\Delta t}^2]   \\
 206    {\bf G}_i&~=~&\frac{m_i{\Delta}{\mathbf{r}_i}}{{\Delta t}^2i}\end{aligned}
 207
 208 This does not help us in the general case. Only when no periodicity is
 209 needed (like in rigid water) this can be used, otherwise we must add the
 210 virial calculation in the inner loop of SHAKE.
 211
 212 When it *is* applicable the virial can be calculated in the single sum
 213 way:
 214
 215 .. math:: \Xi~=~-\frac{1}{2}\sum_i^{N_c}~\mathbf{r}_i\otimes\mathbf{F}_i
 216
 217 where :math:`N_c` is the number of constrained atoms.
 218
 219 Optimizations
 220 -------------
 221
 222 Here we describe some of the algorithmic optimizations used in |Gromacs|,
 223 apart from parallelism.
 224
 225 .. _waterloops:
 226
 227 Inner Loops for Water
 228 ~~~~~~~~~~~~~~~~~~~~~
 229
 230 |Gromacs| uses special inner loops to calculate non-bonded interactions
 231 for water molecules with other atoms, and yet another set of loops for
 232 interactions between pairs of water molecules. There highly optimized
 233 loops for two types of water models. For three site models similar to
 234 SPC \ :ref:`80 <refBerendsen81>`, *i.e.*:
 235
 236 #. There are three atoms in the molecule.
 237
 238 #. The whole molecule is a single charge group.
 239
 240 #. The first atom has Lennard-Jones (sec. :ref:`lj`) and Coulomb
 241    (sec. :ref:`coul`) interactions.
 242
 243 #. Atoms two and three have only Coulomb interactions, and equal
 244    charges.
 245
 246 These loops also works for the SPC/E \ :ref:`178 <refBerendsen87>` and
 247 TIP3P \ :ref:`128 <refJorgensen83>` water models. And for four site water
 248 models similar to TIP4P \ :ref:`128 <refJorgensen83>`:
 249
 250 #. There are four atoms in the molecule.
 251
 252 #. The whole molecule is a single charge group.
 253
 254 #. The first atom has only Lennard-Jones (sec. :ref:`lj`) interactions.
 255
 256 #. Atoms two and three have only Coulomb (sec. :ref:`coul`) interactions,
 257    and equal charges.
 258
 259 #. Atom four has only Coulomb interactions.
 260
 261 The benefit of these implementations is that there are more
 262 floating-point operations in a single loop, which implies that some
 263 compilers can schedule the code better. However, it turns out that even
 264 some of the most advanced compilers have problems with scheduling,
 265 implying that manual tweaking is necessary to get optimum performance.
 266 This may include common-sub-expression elimination, or moving code
 267 around.
 268
 269 .. raw:: latex
 270
 271     \clearpage
 272
 273
 274 .. [1]
 275    Note that some derivations, an alternative notation
 276    :math:`\xi_{\mathrm{alt}} = v_{\xi} = p_{\xi}/Q` is used.
 277
 278 .. [2]
 279    The calculation of Lennard-Jones and Coulomb forces is about 50
 280    floating point operations.