.. cmake:: GMX_CYCLE_SUBCOUNTERS
+ If set to ``ON``, enables performance subcounters that offer more
+ fine-grained mdrun performance measurement and evaluation than the default
+ counters. See :doc:`/user-guide/mdrun-performance` for a description of
+ the available subcounters.
+ Defaults to ``OFF``.
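+
+ For example, a configure step enabling these subcounters (the build directory
+ and any other options are illustrative) could look like::
+
+   cmake .. -DGMX_CYCLE_SUBCOUNTERS=ON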
+
.. cmake:: GMX_DATA_INSTALL_DIR
Sets the directory under :file:`share/` where data files are installed.
Finding out how to run mdrun better
-----------------------------------
-TODO In future patch: red flags in log files, how to interpret wallcycle output
+
+The Wallcycle module is used for runtime performance measurement of :ref:`gmx mdrun`.
+At the end of the log file of each run, the "Real cycle and time accounting" section
+provides a table with runtime statistics for the different parts of the :ref:`gmx mdrun` code,
+one part per row.
+The table columns indicate the number of ranks and threads that
+executed the respective part of the run, and give wall-time and cycle
+count aggregates (across all threads and ranks) averaged over the entire run.
+The last column shows the percentage of the total runtime each row represents.
+Note that the :ref:`gmx mdrun` counter resetting functionality (``-resethway`` and ``-resetstep``)
+resets the performance counters and is therefore useful to avoid startup overhead and
+performance instability (e.g. due to load balancing) at the beginning of the run.
+
+The performance counters are:
+
+* Particle-particle during Particle mesh Ewald
+* Domain decomposition
+* Domain decomposition communication load
+* Domain decomposition communication bounds
+* Virtual site constraints
+* Send X to Particle mesh Ewald
+* Neighbor search
+* Launch GPU operations
+* Communication of coordinates
+* Born radii
+* Force
+* Waiting + Communication of force
+* Particle mesh Ewald
+* PME redist. X/F
+* PME spread/gather
+* PME 3D-FFT
+* PME 3D-FFT Communication
+* PME solve Lennard-Jones
+* PME solve Elec
+* PME wait for particle-particle
+* Wait + Receive PME force
+* Wait GPU nonlocal
+* Wait GPU local
+* Non-bonded position/force buffer operations
+* Virtual site spread
+* COM pull force
+* Write trajectory
+* Update
+* Constraints
+* Communication of energies
+* Enforced rotation
+* Add rotational forces
+* Position swapping
+* Interactive MD
+
+As performance data is collected for every run, it is essential for assessing
+and tuning the performance of :ref:`gmx mdrun`, and therefore benefits
+both code developers and users of the program.
+The counters are averages of the time/cycles the different parts of the simulation take,
+hence they cannot directly reveal fluctuations during a single run (although comparisons
+across multiple runs are still very useful).
+
+Counters will appear in the MD log file only if the related parts of the code were
+executed during the :ref:`gmx mdrun` run. There is also a special counter called "Rest" which
+indicates the amount of time not accounted for by any of the counters above. Therefore,
+a significant amount of "Rest" time (more than a few percent) is often an indication of
+parallelization inefficiency (e.g. serial code), and it is recommended to report it to the
+developers.
+
+An additional set of subcounters can offer more fine-grained inspection of performance. They are:
+
+* Domain decomposition redistribution
+* DD neighbor search grid + sort
+* DD setup communication
+* DD make topology
+* DD make constraints
+* DD topology other
+* Neighbor search grid local
+* NS grid non-local
+* NS search local
+* NS search non-local
+* Bonded force
+* Bonded-FEP force
+* Restraints force
+* Listed buffer operations
+* Nonbonded force
+* Ewald force correction
+* Non-bonded position buffer operations
+* Non-bonded force buffer operations
+
+Subcounters are geared toward developers and have to be enabled during compilation. See
+:doc:`/dev-manual/build-system` for more information.
+
+TODO In future patch:
+
+- red flags in log files, how to interpret wallcycle output
+- hints to devs how to extend wallcycles
TODO In future patch: import wiki page stuff on performance checklist; maybe here,
maybe elsewhere
[ HIP ]
[ atoms ]
- N N -0.150599 1
- H H 0.174903 2
- CA CT -0.139355 3
- HA H1 0.103647 4
- CB CT -0.105724 5
- HB1 HC 0.102146 6
- HB2 HC 0.102146 7
- CG CC 0.051128 8
- ND1 NA 0.002100 9
- HD1 H 0.258443 10
- CE1 CR -0.033333 11
- HE1 H5 0.218884 12
- NE2 NA -0.140978 13
- HE2 H 0.353373 14
- CD2 CW -0.143848 15
- HD2 H4 0.213519 16
- C C 0.675645 17
- O O -0.542097 18
+ N N -0.424967 1
+ H H 0.285872 2
+ CA CT 0.375022 3
+ HA H1 -0.014621 4
+ CB CT -0.332123 5
+ HB1 HC 0.107725 6
+ HB2 HC 0.107725 7
+ CG CC 0.182399 8
+ ND1 NA -0.087602 9
+ HD1 H 0.305096 10
+ CE1 CR -0.013105 11
+ HE1 H5 0.230635 12
+ NE2 NA -0.148766 13
+ HE2 H 0.377295 14
+ CD2 CW -0.192052 15
+ HD2 H4 0.235237 16
+ C C 0.566646 17
+ O O -0.560417 18
[ bonds ]
N H
N CA
[ LYN ]
[ atoms ]
- N N -0.415700 1
- H H 0.271900 2
- CA CT -0.072060 3
- HA H1 0.099400 4
- CB CT -0.048450 5
- HB1 HC 0.034000 6
- HB2 HC 0.034000 7
- CG CT 0.066120 8
- HG1 HC 0.010410 9
- HG2 HC 0.010410 10
- CD CT -0.037680 11
- HD1 HC 0.011550 12
- HD2 HC 0.011550 13
- CE CT 0.326040 14
- HE1 HP -0.033580 15
- HE2 HP -0.033580 16
- NZ N3 -1.035810 17
- HZ1 H 0.386040 18
- HZ2 H 0.386040 19
- C C 0.597300 20
- O O -0.567900 21
+ N N -0.453388 1
+ H H 0.289695 2
+ CA CT -0.024500 3
+ HA H1 0.099553 4
+ CB CT 0.035478 5
+ HB1 HC 0.004797 6
+ HB2 HC 0.004797 7
+ CG CT -0.019962 8
+ HG1 HC -0.015610 9
+ HG2 HC -0.015610 10
+ CD CT 0.041105 11
+ HD1 HC 0.008304 12
+ HD2 HC 0.008304 13
+ CE CT 0.188382 14
+ HE1 HP 0.016810 15
+ HE2 HP 0.016810 16
+ NZ N3 -0.894254 17
+ HZ1 H 0.332053 18
+ HZ2 H 0.332053 19
+ C C 0.608464 20
+ O O -0.563281 21
[ bonds ]
N H
N CA
GLY 2
1 1 HN N -C CA
2 6 HA CA C N
-HIS1 6
-1 1 HN N -C CA
-1 5 HA CA N C CB
-2 6 HB CB CG CA
-1 1 HD2 CD2 CG NE2
-1 1 HE1 CE1 ND1 NE2
-1 1 HD1 ND1 CG CE1
HSD 6
1 1 HN N -C CA
1 5 HA CA N C CB
GLY 2
1 1 H N -C CA
2 6 HA CA C N
-HIS1 6
-1 1 H N -C CA
-1 5 HA CA N C CB
-2 6 HB CB CG CA
-1 1 HD2 CD2 CG NE2
-1 1 HE1 CE1 ND1 NE2
-1 1 HD1 ND1 CG CE1
HISD 6
1 1 H N -C CA
1 5 HA CA N C CB
ND1 NE2 CE1 HE1 improper_Z_CA_X_Y
-[ HIS1 ] ; Identical to HISD
- [ atoms ]
- N opls_238 -0.500 1
- H opls_241 0.300 1
- CA opls_224B 0.140 1
- HA opls_140 0.060 1
- CB opls_505 -0.297 2
- HB1 opls_140 0.060 2
- HB2 opls_140 0.060 2
- CG opls_508 -0.261 3
- ND1 opls_503 -0.291 4
- HD1 opls_504 0.326 4
- CD2 opls_507 0.504 5
- HD2 opls_146 0.183 5
- CE1 opls_506 0.182 6
- HE1 opls_146 0.098 6
- NE2 opls_511 -0.564 7
- C opls_235 0.500 8
- O opls_236 -0.500 8
- [ bonds ]
- N H
- N CA
- CA HA
- CA CB
- CA C
- CB HB1
- CB HB2
- CB CG
- CG ND1
- CG CD2
- ND1 HD1
- ND1 CE1
- CD2 HD2
- CD2 NE2
- CE1 HE1
- CE1 NE2
- C O
- -C N
- [ dihedrals ] ; override some with residue-specific ones
- N CA CB CG dih_HIS_chi1_N_C_C_C
- CG CB CA C dih_HIS_chi1_C_C_C_CO
- CA CB CG ND1 dih_HIS_chi2_C_C_C_N
- [ impropers ]
- -C CA N H improper_Z_N_X_Y
- CA +N C O improper_O_C_X_Y
- ND1 CD2 CG CB improper_Z_CA_X_Y
- CG CE1 ND1 HD1 improper_Z_N_X_Y
- CG NE2 CD2 HD2 improper_Z_CA_X_Y
- ND1 NE2 CE1 HE1 improper_Z_CA_X_Y
-
-
[ HISE ]
[ atoms ]
N opls_238 -0.500 1
while (((i+n) < disres->nr) &&
(forceparams[forceatoms[i+n]].disres.label == label));
- calc_disres_R_6(n, &forceatoms[i], forceparams,
+ calc_disres_R_6(NULL, n, &forceatoms[i],
(const rvec*)x, pbc, fcd, NULL);
if (fcd->disres.Rt_6[0] <= 0)
#include "gromacs/topology/topology.h"
#include "gromacs/utility/fatalerror.h"
#include "gromacs/utility/futil.h"
+#include "gromacs/utility/gmxassert.h"
#include "gromacs/utility/pleasecite.h"
#include "gromacs/utility/smalloc.h"
gmx_mtop_ilistloop_t iloop;
t_ilist *il;
char *ptr;
+ int type_min, type_max;
dd = &(fcd->disres);
fprintf(fplog, "Initializing the distance restraints\n");
}
-
- if (ir->eDisre == edrEnsemble)
- {
- gmx_fatal(FARGS, "Sorry, distance restraints with ensemble averaging over multiple molecules in one system are not functional in this version of GROMACS");
- }
-
dd->dr_weighting = ir->eDisreWeighting;
dd->dr_fc = ir->dr_fc;
if (EI_DYNAMICS(ir->eI))
}
else
{
+ /* We store the r^-6 time averages in an array that is indexed
+ * with the local disres iatom index, so this doesn't work with DD.
+ * Note that DD is not initialized yet here, so we check for PAR(cr),
+ * but there are probably also issues with e.g. NM MPI parallelization.
+ */
+ if (cr && PAR(cr))
+ {
+ gmx_fatal(FARGS, "Time-averaged distance restraints are not supported with MPI parallelization. You can use OpenMP parallelization on a single node.");
+ }
+
dd->dr_bMixed = ir->bDisreMixed;
dd->ETerm = std::exp(-(ir->delta_t/ir->dr_tau));
}
dd->nres = 0;
dd->npair = 0;
+ type_min = INT_MAX;
+ type_max = 0;
iloop = gmx_mtop_ilistloop_init(mtop);
while (gmx_mtop_ilistloop_next(iloop, &il, &nmol))
{
+ if (nmol > 1 && ir->eDisre != edrEnsemble)
+ {
+ gmx_fatal(FARGS, "NMR distance restraints with multiple copies of the same molecule are currently only supported with ensemble averaging. If you just want to restrain distances between atom pairs using a flat-bottomed potential, use a restraint potential (bonds type 10) instead.");
+ }
+
np = 0;
for (fa = 0; fa < il[F_DISRES].nr; fa += 3)
{
+ int type;
+
+ type = il[F_DISRES].iatoms[fa];
+
np++;
- npair = mtop->ffparams.iparams[il[F_DISRES].iatoms[fa]].disres.npair;
+ npair = mtop->ffparams.iparams[type].disres.npair;
if (np == npair)
{
- dd->nres += (ir->eDisre == edrEnsemble ? 1 : nmol)*npair;
+ dd->nres += (ir->eDisre == edrEnsemble ? 1 : nmol);
dd->npair += nmol*npair;
np = 0;
+
+ type_min = std::min(type_min, type);
+ type_max = std::max(type_max, type);
}
}
}
- if (cr && PAR(cr))
+ if (cr && PAR(cr) && ir->nstdisreout > 0)
{
- /* Temporary check, will be removed when disre is implemented with DD */
- const char *notestr = "NOTE: atoms involved in distance restraints should be within the same domain. If this is not the case mdrun generates a fatal error. If you encounter this, use a single MPI rank (Verlet+OpenMP+GPUs work fine).";
+ /* With DD we currently only have local pair information available */
+ gmx_fatal(FARGS, "With MPI parallelization distance-restraint pair output is not supported. Use nstdisreout=0 or use OpenMP parallelization on a single node.");
+ }
- if (MASTER(cr))
- {
- fprintf(stderr, "\n%s\n\n", notestr);
- }
- if (fplog)
- {
- fprintf(fplog, "%s\n", notestr);
- }
+ /* For communicating and/or reducing (sums of) r^-6 for pairs over threads
+ * we use multiple arrays in t_disresdata. We need to have unique indices
+ * for each restraint that work over threads and MPI ranks. To this end
+ * we use the type index. These should all be in one block and there should
+ * be dd->nres types, but we check for this here.
+ * This setup currently does not allow for multiple copies of the same
+ * molecule without ensemble averaging; this is checked for above.
+ */
+ GMX_RELEASE_ASSERT(type_max - type_min + 1 == dd->nres, "All distance restraint parameter entries in the topology should be consecutive");
- if (dd->dr_tau != 0 || ir->eDisre == edrEnsemble ||
- dd->nres != dd->npair)
- {
- gmx_fatal(FARGS, "Time or ensemble averaged or multiple pair distance restraints do not work (yet) with domain decomposition, use a single MPI rank%s", cr->ms ? " per simulation" : "");
- }
- if (ir->nstdisreout != 0)
- {
- if (fplog)
- {
- fprintf(fplog, "\nWARNING: Can not write distance restraint data to energy file with domain decomposition\n\n");
- }
- if (MASTER(cr))
- {
- fprintf(stderr, "\nWARNING: Can not write distance restraint data to energy file with domain decomposition\n");
- }
- ir->nstdisreout = 0;
- }
- }
+ dd->type_min = type_min;
snew(dd->rt, dd->npair);
}
fprintf(fplog, "\n");
}
- snew(dd->Rtl_6, dd->nres);
#endif
}
else
{
dd->nsystems = 1;
+ }
+
+ if (dd->nsystems == 1)
+ {
dd->Rtl_6 = dd->Rt_6;
}
+ else
+ {
+ snew(dd->Rtl_6, dd->nres);
+ }
if (dd->npair > 0)
{
}
}
-void calc_disres_R_6(int nfa, const t_iatom forceatoms[], const t_iparams ip[],
+void calc_disres_R_6(const t_commrec *cr,
+ int nfa, const t_iatom forceatoms[],
const rvec x[], const t_pbc *pbc,
t_fcdata *fcd, history_t *hist)
{
- int ai, aj;
- int fa, res, pair;
- int type, npair, np;
rvec dx;
real *rt, *rm3tav, *Rtl_6, *Rt_6, *Rtav_6;
- real rt_1, rt_3, rt2;
t_disresdata *dd;
- real ETerm, ETerm1, cf1 = 0, cf2 = 0, invn = 0;
+ real ETerm, ETerm1, cf1 = 0, cf2 = 0;
gmx_bool bTav;
dd = &(fcd->disres);
cf2 = 1.0/(1.0 - dd->exp_min_t_tau);
}
- if (dd->nsystems > 1)
+ for (int res = 0; res < dd->nres; res++)
{
- invn = 1.0/dd->nsystems;
+ Rtav_6[res] = 0.0;
+ Rt_6[res] = 0.0;
}
/* 'loop' over all atom pairs (pair_nr=fa/3) involved in restraints, *
* the total number of atoms pairs is nfa/3 */
- res = 0;
- fa = 0;
- while (fa < nfa)
+ for (int fa = 0; fa < nfa; fa += 3)
{
- type = forceatoms[fa];
- npair = ip[type].disres.npair;
+ int type = forceatoms[fa];
+ int res = type - dd->type_min;
+ int pair = fa/3;
+ int ai = forceatoms[fa+1];
+ int aj = forceatoms[fa+2];
- Rtav_6[res] = 0.0;
- Rt_6[res] = 0.0;
+ if (pbc)
+ {
+ pbc_dx_aiuc(pbc, x[ai], x[aj], dx);
+ }
+ else
+ {
+ rvec_sub(x[ai], x[aj], dx);
+ }
+ real rt2 = iprod(dx, dx);
+ real rt_1 = gmx::invsqrt(rt2);
+ real rt_3 = rt_1*rt_1*rt_1;
- /* Loop over the atom pairs of 'this' restraint */
- np = 0;
- while (fa < nfa && np < npair)
+ rt[pair] = rt2*rt_1;
+ if (bTav)
{
- pair = fa/3;
- ai = forceatoms[fa+1];
- aj = forceatoms[fa+2];
+ /* Here we update rm3tav in t_fcdata using the data
+ * in history_t.
+ * Thus the results stay correct when this routine
+ * is called multiple times.
+ */
+ rm3tav[pair] = cf2*((ETerm - cf1)*hist->disre_rm3tav[pair] +
+ ETerm1*rt_3);
+ }
+ else
+ {
+ rm3tav[pair] = rt_3;
+ }
- if (pbc)
- {
- pbc_dx_aiuc(pbc, x[ai], x[aj], dx);
- }
- else
- {
- rvec_sub(x[ai], x[aj], dx);
- }
- rt2 = iprod(dx, dx);
- rt_1 = gmx::invsqrt(rt2);
- rt_3 = rt_1*rt_1*rt_1;
+ /* Currently divide_bondeds_over_threads() ensures that pairs within
+ * the same restraint get assigned to the same thread, so we could
+ * run this loop thread-parallel.
+ */
+ Rt_6[res] += rt_3*rt_3;
+ Rtav_6[res] += rm3tav[pair]*rm3tav[pair];
+ }
- rt[pair] = std::sqrt(rt2);
- if (bTav)
- {
- /* Here we update rm3tav in t_fcdata using the data
- * in history_t.
- * Thus the results stay correct when this routine
- * is called multiple times.
- */
- rm3tav[pair] = cf2*((ETerm - cf1)*hist->disre_rm3tav[pair] +
- ETerm1*rt_3);
- }
- else
- {
- rm3tav[pair] = rt_3;
- }
+ /* NOTE: Rt_6 and Rtav_6 are stored consecutively in memory */
+ if (cr && DOMAINDECOMP(cr))
+ {
+ gmx_sum(2*dd->nres, dd->Rt_6, cr);
+ }
- Rt_6[res] += rt_3*rt_3;
- Rtav_6[res] += rm3tav[pair]*rm3tav[pair];
+ if (fcd->disres.nsystems > 1)
+ {
+ real invn = 1.0/dd->nsystems;
- fa += 3;
- np++;
- }
- if (dd->nsystems > 1)
+ for (int res = 0; res < dd->nres; res++)
{
Rtl_6[res] = Rt_6[res];
Rt_6[res] *= invn;
Rtav_6[res] *= invn;
}
- res++;
+ GMX_ASSERT(cr != NULL && cr->ms != NULL, "We need multisim with nsystems>1");
+ gmx_sum_sim(2*dd->nres, dd->Rt_6, cr->ms);
+
+ if (DOMAINDECOMP(cr))
+ {
+ gmx_bcast(2*dd->nres, dd->Rt_6, cr);
+ }
}
+
+ /* Store the base forceatoms pointer, so we can re-calculate the pair
+ * index in ta_disres() for indexing pair data in t_disresdata when
+ * using thread parallelization.
+ */
+ dd->forceatomsStart = forceatoms;
+
+ dd->sumviol = 0;
}
real ta_disres(int nfa, const t_iatom forceatoms[], const t_iparams ip[],
{
const real seven_three = 7.0/3.0;
- int ai, aj;
- int fa, res, npair, p, pair, ki = CENTRAL, m;
- int type;
rvec dx;
real weight_rt_1;
real smooth_fc, Rt, Rtav, rt2, *Rtl_6, *Rt_6, *Rtav_6;
/* 'loop' over all atom pairs (pair_nr=fa/3) involved in restraints, *
* the total number of atoms pairs is nfa/3 */
- res = 0;
- fa = 0;
- while (fa < nfa)
+ int faOffset = static_cast<int>(forceatoms - dd->forceatomsStart);
+ for (int fa = 0; fa < nfa; fa += 3)
{
- type = forceatoms[fa];
- /* Take action depending on restraint, calculate scalar force */
- npair = ip[type].disres.npair;
- up1 = ip[type].disres.up1;
- up2 = ip[type].disres.up2;
- low = ip[type].disres.low;
- k0 = smooth_fc*ip[type].disres.kfac;
+ int type = forceatoms[fa];
+ int npair = ip[type].disres.npair;
+ up1 = ip[type].disres.up1;
+ up2 = ip[type].disres.up2;
+ low = ip[type].disres.low;
+ k0 = smooth_fc*ip[type].disres.kfac;
+
+ int res = type - dd->type_min;
/* save some flops when there is only one pair */
if (ip[type].disres.type != 2)
}
else
{
- /* When rtype=2 use instantaneous not ensemble avereged distance */
+ /* When rtype=2 use instantaneous not ensemble averaged distance */
bConservative = (npair > 1);
bMixed = FALSE;
Rt = gmx::invsixthroot(Rtl_6[res]);
if (bViolation)
{
+ /* Add 1/npair energy and violation for each of the npair pairs */
+ real pairFac = 1/static_cast<real>(npair);
+
/* NOTE:
* there is no real potential when time averaging is applied
*/
- vtot += 0.5*k0*gmx::square(tav_viol);
- if (1/vtot == 0)
- {
- printf("vtot is inf: %f\n", vtot);
- }
+ vtot += 0.5*k0*gmx::square(tav_viol)*pairFac;
if (!bMixed)
{
f_scal = -k0*tav_viol;
- violtot += fabs(tav_viol);
+ violtot += fabs(tav_viol)*pairFac;
}
else
{
{
mixed_viol = std::sqrt(tav_viol*instant_viol);
f_scal = -k0*mixed_viol;
- violtot += mixed_viol;
+ violtot += mixed_viol*pairFac;
}
}
}
/* Exert the force ... */
- /* Loop over the atom pairs of 'this' restraint */
- for (p = 0; p < npair; p++)
+ int pair = (faOffset + fa)/3;
+ int ai = forceatoms[fa+1];
+ int aj = forceatoms[fa+2];
+ int ki = CENTRAL;
+ if (pbc)
{
- pair = fa/3;
- ai = forceatoms[fa+1];
- aj = forceatoms[fa+2];
+ ki = pbc_dx_aiuc(pbc, x[ai], x[aj], dx);
+ }
+ else
+ {
+ rvec_sub(x[ai], x[aj], dx);
+ }
+ rt2 = iprod(dx, dx);
+
+ weight_rt_1 = gmx::invsqrt(rt2);
- if (pbc)
+ if (bConservative)
+ {
+ if (!dr_bMixed)
{
- ki = pbc_dx_aiuc(pbc, x[ai], x[aj], dx);
+ weight_rt_1 *= std::pow(dd->rm3tav[pair], seven_three);
}
else
{
- rvec_sub(x[ai], x[aj], dx);
- }
- rt2 = iprod(dx, dx);
-
- weight_rt_1 = gmx::invsqrt(rt2);
-
- if (bConservative)
- {
- if (!dr_bMixed)
- {
- weight_rt_1 *= std::pow(dd->rm3tav[pair], seven_three);
- }
- else
- {
- weight_rt_1 *= tav_viol_Rtav7*std::pow(dd->rm3tav[pair], seven_three)+
- instant_viol_Rtav7/(dd->rt[pair]*gmx::power6(dd->rt[pair]));
- }
+ weight_rt_1 *= tav_viol_Rtav7*std::pow(dd->rm3tav[pair], seven_three)+
+ instant_viol_Rtav7/(dd->rt[pair]*gmx::power6(dd->rt[pair]));
}
+ }
- fk_scal = f_scal*weight_rt_1;
+ fk_scal = f_scal*weight_rt_1;
- if (g)
- {
- ivec_sub(SHIFT_IVEC(g, ai), SHIFT_IVEC(g, aj), dt);
- ki = IVEC2IS(dt);
- }
+ if (g)
+ {
+ ivec_sub(SHIFT_IVEC(g, ai), SHIFT_IVEC(g, aj), dt);
+ ki = IVEC2IS(dt);
+ }
- for (m = 0; m < DIM; m++)
- {
- fij = fk_scal*dx[m];
+ for (int m = 0; m < DIM; m++)
+ {
+ fij = fk_scal*dx[m];
- f[ai][m] += fij;
- f[aj][m] -= fij;
- fshift[ki][m] += fij;
- fshift[CENTRAL][m] -= fij;
- }
- fa += 3;
+ f[ai][m] += fij;
+ f[aj][m] -= fij;
+ fshift[ki][m] += fij;
+ fshift[CENTRAL][m] -= fij;
}
}
- else
- {
- /* No violation so force and potential contributions */
- fa += 3*npair;
- }
- res++;
}
- dd->sumviol = violtot;
+#pragma omp atomic
+ dd->sumviol += violtot;
/* Return energy */
return vtot;
*
* Copyright (c) 1991-2000, University of Groningen, The Netherlands.
* Copyright (c) 2001-2004, The GROMACS development team.
- * Copyright (c) 2013,2014,2015, by the GROMACS development team, led by
+ * Copyright (c) 2013,2014,2015,2016, by the GROMACS development team, led by
* Mark Abraham, David van der Spoel, Berk Hess, and Erik Lindahl,
* and including many others, as listed in the AUTHORS file in the
* top-level source directory and at http://www.gromacs.org.
* Calculates r and r^-3 (inst. and time averaged) for all pairs
* and the ensemble averaged r^-6 (inst. and time averaged) for all restraints
*/
-void calc_disres_R_6(int nfa, const t_iatom *fa, const t_iparams ip[],
+void calc_disres_R_6(const t_commrec *cr,
+ int nfa, const t_iatom *fa,
const rvec *x, const t_pbc *pbc,
t_fcdata *fcd, history_t *hist);
#include "listed-forces.h"
-#include "config.h"
-
#include <assert.h>
#include <algorithm>
#include "gromacs/math/vec.h"
#include "gromacs/mdlib/force.h"
#include "gromacs/mdlib/force_flags.h"
+#include "gromacs/mdtypes/commrec.h"
#include "gromacs/mdtypes/fcdata.h"
#include "gromacs/mdtypes/forcerec.h"
#include "gromacs/mdtypes/inputrec.h"
(ftype < F_GB12 || ftype > F_GB14);
}
-void calc_listed(const struct gmx_multisim_t *ms,
+void calc_listed(const t_commrec *cr,
struct gmx_wallcycle *wcycle,
const t_idef *idef,
const rvec x[], history_t *hist,
if ((idef->il[F_POSRES].nr > 0) ||
(idef->il[F_FBPOSRES].nr > 0) ||
- (idef->il[F_ORIRES].nr > 0) ||
- (idef->il[F_DISRES].nr > 0))
+ fcd->orires.nr > 0 ||
+ fcd->disres.nres > 0)
{
/* TODO Use of restraints triggers further function calls
inside the loop over calc_one_bond(), but those are too
}
/* Do pre force calculation stuff which might require communication */
- if (idef->il[F_ORIRES].nr > 0)
+ if (fcd->orires.nr > 0)
{
enerd->term[F_ORIRESDEV] =
- calc_orires_dev(ms, idef->il[F_ORIRES].nr,
+ calc_orires_dev(cr->ms, idef->il[F_ORIRES].nr,
idef->il[F_ORIRES].iatoms,
idef->iparams, md, x,
pbc_null, fcd, hist);
}
- if (idef->il[F_DISRES].nr)
+ if (fcd->disres.nres > 0)
{
- calc_disres_R_6(idef->il[F_DISRES].nr,
+ calc_disres_R_6(cr,
+ idef->il[F_DISRES].nr,
idef->il[F_DISRES].iatoms,
- idef->iparams, x, pbc_null,
+ x, pbc_null,
fcd, hist);
-#if GMX_MPI
- if (fcd->disres.nsystems > 1)
- {
- gmx_sum_sim(2*fcd->disres.nres, fcd->disres.Rt_6, ms);
- }
-#endif
}
wallcycle_sub_stop(wcycle, ewcsRESTRAINTS);
do_force_listed(struct gmx_wallcycle *wcycle,
matrix box,
const t_lambda *fepvals,
- const struct gmx_multisim_t *ms,
+ const t_commrec *cr,
const t_idef *idef,
const rvec x[],
history_t *hist,
/* Not enough flops to bother counting */
set_pbc(&pbc_full, fr->ePBC, box);
}
- calc_listed(ms, wcycle, idef, x, hist, f, fr, pbc, &pbc_full,
+ calc_listed(cr, wcycle, idef, x, hist, f, fr, pbc, &pbc_full,
graph, enerd, nrnb, lambda, md, fcd,
global_atom_index, flags);
/*
* This file is part of the GROMACS molecular simulation package.
*
- * Copyright (c) 2014,2015, by the GROMACS development team, led by
+ * Copyright (c) 2014,2015,2016, by the GROMACS development team, led by
* Mark Abraham, David van der Spoel, Berk Hess, and Erik Lindahl,
* and including many others, as listed in the AUTHORS file in the
* top-level source directory and at http://www.gromacs.org.
struct gmx_enerdata_t;
struct gmx_grppairener_t;
-struct gmx_multisim_t;
struct history_t;
+struct t_commrec;
struct t_fcdata;
struct t_forcerec;
struct t_idef;
*
* Note that pbc_full is used only for position restraints, and is
* not initialized if there are none. */
-void calc_listed(const struct gmx_multisim_t *ms,
+void calc_listed(const t_commrec *cr,
struct gmx_wallcycle *wcycle,
const t_idef *idef,
const rvec x[], history_t *hist,
do_force_listed(struct gmx_wallcycle *wcycle,
matrix box,
const t_lambda *fepvals,
- const struct gmx_multisim_t *ms,
+ const t_commrec *cr,
const t_idef *idef,
const rvec x[],
history_t *hist,
TRUE, box);
}
- do_force_listed(wcycle, box, ir->fepvals, cr->ms,
+ do_force_listed(wcycle, box, ir->fepvals, cr,
idef, (const rvec *) x, hist, f, fr,
&pbc, graph, enerd, nrnb, lambda, md, fcd,
DOMAINDECOMP(cr) ? cr->dd->gatindex : NULL,
{
t_state *s1, *s2;
- int i;
int start, end;
- rvec *x1, *x2;
real dvdl_constr;
int nthreads gmx_unused;
s2->natoms = s1->natoms;
copy_mat(s1->box, s2->box);
/* Copy free energy state */
- for (i = 0; i < efptNR; i++)
+ for (int i = 0; i < efptNR; i++)
{
s2->lambda[i] = s1->lambda[i];
}
start = 0;
end = md->homenr;
- x1 = s1->x;
- x2 = s2->x;
-
// cppcheck-suppress unreadVariable
nthreads = gmx_omp_nthreads_get(emntUpdate);
#pragma omp parallel num_threads(nthreads)
{
- int gf, i, m;
+ rvec *x1 = s1->x;
+ rvec *x2 = s2->x;
- gf = 0;
+ int gf = 0;
#pragma omp for schedule(static) nowait
- for (i = start; i < end; i++)
+ for (int i = start; i < end; i++)
{
try
{
{
gf = md->cFREEZE[i];
}
- for (m = 0; m < DIM; m++)
+ for (int m = 0; m < DIM; m++)
{
if (ir->opts.nFreeze[gf][m])
{
if (s2->flags & (1<<estCGP))
{
/* Copy the CG p vector */
- x1 = s1->cg_p;
- x2 = s2->cg_p;
+ rvec *p1 = s1->cg_p;
+ rvec *p2 = s2->cg_p;
#pragma omp for schedule(static) nowait
- for (i = start; i < end; i++)
+ for (int i = start; i < end; i++)
{
// Trivial OpenMP block that does not throw
- copy_rvec(x1[i], x2[i]);
+ copy_rvec(p1[i], p2[i]);
}
}
}
s2->ncg_gl = s1->ncg_gl;
#pragma omp for schedule(static) nowait
- for (i = 0; i < s2->ncg_gl; i++)
+ for (int i = 0; i < s2->ncg_gl; i++)
{
s2->cg_gl[i] = s1->cg_gl[i];
}
if (cl_error)
{
- printf("ClERROR! %d\n", cl_error);
+ printf("OpenCL error: %s\n", ocl_get_error_string(cl_error).c_str());
}
cl_error = clEnqueueNDRangeKernel(stream, nb_kernel, 3, NULL, global_work_size, local_work_size, 0, NULL, bDoTime ? &(t->nb_k[iloc]) : NULL);
assert(cl_error == CL_SUCCESS);
snew(shfc, 1);
shfc->nflexcon = nflexcon;
+ if (nshell == 0)
+ {
+ /* Only flexible constraints, no shells.
+ * Note that make_local_shells() does not need to be called.
+ */
+ shfc->nshell = 0;
+ shfc->bPredict = FALSE;
+
+ return shfc;
+ }
+
if (nstcalcenergy != 1)
{
gmx_fatal(FARGS, "You have nstcalcenergy set to a value (%d) that is different from 1.\nThis is not supported in combination with shell particles.\nPlease make a new tpr file.", nstcalcenergy);
gmx_fatal(FARGS, "Shell particles are not implemented with domain decomposition, use a single rank");
}
- if (nshell == 0)
- {
- return shfc;
- }
-
/* We have shells: fill the shell data structure */
/* Global system sized array, this should be avoided */
*
* Copyright (c) 1991-2000, University of Groningen, The Netherlands.
* Copyright (c) 2001-2004, The GROMACS development team.
- * Copyright (c) 2012,2014,2015, by the GROMACS development team, led by
+ * Copyright (c) 2012,2014,2015,2016, by the GROMACS development team, led by
* Mark Abraham, David van der Spoel, Berk Hess, and Erik Lindahl,
* and including many others, as listed in the AUTHORS file in the
* top-level source directory and at http://www.gromacs.org.
#define GMX_MDTYPES_FCDATA_H
#include "gromacs/math/vectypes.h"
+#include "gromacs/topology/idef.h"
#include "gromacs/utility/basedefinitions.h"
#include "gromacs/utility/real.h"
real exp_min_t_tau; /* Factor for slowly switching on the force */
int nres; /* The number of distance restraints */
int npair; /* The number of distance restraint pairs */
+ int type_min; /* The minimum iparam type index for restraints */
real sumviol; /* The sum of violations */
- real *rt; /* The calculated instantaneous distance (npr) */
- real *rm3tav; /* The calculated time averaged distance (npr) */
- real *Rtl_6; /* The calculated instantaneous r^-6 (nr) */
- real *Rt_6; /* The calculated inst. ens. averaged r^-6 (nr) */
- real *Rtav_6; /* The calculated time and ens. averaged r^-6 (nr) */
+ real *rt; /* The instantaneous distance (npair) */
+ real *rm3tav; /* The time averaged distance (npair) */
+ real *Rtl_6; /* The instantaneous r^-6 (nres) */
+ real *Rt_6; /* The instantaneous ensemble averaged r^-6 (nres) */
+ real *Rtav_6; /* The time and ensemble averaged r^-6 (nres) */
int nsystems; /* The number of systems for ensemble averaging */
+
+ /* TODO: Implement a proper solution for parallel disre indexing */
+ const t_iatom *forceatomsStart; /* Pointer to the start of the disre forceatoms */
} t_disresdata;