With hybrid MPI+OpenMP parallelization and the rank local FFT grid
size for a dimension not divisible by the number of OpenMP threads
in that dimension, a very small amount of the FFT grid overlap part
could set/added twice. This would only occur at low-medium MPI
parallelization with OpenMP thread counts with large prime factors,
which is practice means almost never. Even when it occured, no actual
differences in PME energies or forces were observed.
This issue was due to a leftover from when space was uniformly
divided over the grids iso assigning whole grid lines.
Fixes #1388.
Change-Id: I22904c7f55d2e96fc4b8cd1498af2087eaed47ac
/* Now loop over all the thread data blocks that contribute
* to the grid region we (our thread) are operating on.
*/
/* Now loop over all the thread data blocks that contribute
* to the grid region we (our thread) are operating on.
*/
- /* Note that ffy_nx/y is equal to the number of grid points
+ /* Note that fft_nx/y is equal to the number of grid points
* between the first point of our node grid and the one of the next node.
*/
for (sx = 0; sx >= -pmegrids->nthread_comm[XX]; sx--)
* between the first point of our node grid and the one of the next node.
*/
for (sx = 0; sx >= -pmegrids->nthread_comm[XX]; sx--)
}
pmegrid_g = &pmegrids->grid_th[fx*pmegrids->nc[YY]*pmegrids->nc[ZZ]];
ox += pmegrid_g->offset[XX];
}
pmegrid_g = &pmegrids->grid_th[fx*pmegrids->nc[YY]*pmegrids->nc[ZZ]];
ox += pmegrid_g->offset[XX];
- if (!bCommX)
- {
- tx1 = min(ox + pmegrid_g->n[XX], ne[XX]);
- }
- else
- {
- tx1 = min(ox + pmegrid_g->n[XX], pme->pme_order);
- }
+ /* Determine the end of our part of the source grid */
+ tx1 = min(ox + pmegrid_g->n[XX], ne[XX]);
for (sy = 0; sy >= -pmegrids->nthread_comm[YY]; sy--)
{
for (sy = 0; sy >= -pmegrids->nthread_comm[YY]; sy--)
{
}
pmegrid_g = &pmegrids->grid_th[fy*pmegrids->nc[ZZ]];
oy += pmegrid_g->offset[YY];
}
pmegrid_g = &pmegrids->grid_th[fy*pmegrids->nc[ZZ]];
oy += pmegrid_g->offset[YY];
- if (!bCommY)
- {
- ty1 = min(oy + pmegrid_g->n[YY], ne[YY]);
- }
- else
- {
- ty1 = min(oy + pmegrid_g->n[YY], pme->pme_order);
- }
+ /* Determine the end of our part of the source grid */
+ ty1 = min(oy + pmegrid_g->n[YY], ne[YY]);
for (sz = 0; sz >= -pmegrids->nthread_comm[ZZ]; sz--)
{
for (sz = 0; sz >= -pmegrids->nthread_comm[ZZ]; sz--)
{