Alan Gray [Mon, 28 Oct 2019 09:07:46 +0000 (02:07 -0700)]
Bug fix for event passed to GPU comm features
The GPU coordinate halo exchange and coordinate PME-PP transfer were
waiting on an event signalling that the coordinates are available on
the GPU that lives in the NB class, but the new state propagator
feature instead records a different event for this purpose, breaking
the dependency. This change fixes the bug by instead passing the state
propagator event as a dependency to these methods. It needs to be done
every step rather than stored in the class since it can change
depending on the type of step.
Change-Id: I7ea048c6f303192c1310d6b5593227c0ad9a81d0
Pascal Merz [Thu, 31 Oct 2019 04:51:56 +0000 (22:51 -0600)]
Check for contradicting environment variables (modular simulator)
Having both environment variables GMX_USE_MODULAR_SIMULATOR=ON and
GMX_DISABLE_MODULAR_SIMULATOR=ON set has no defined behavior and
should therefore yield an error. This change adds a release assert
to ensure this.
Change-Id: I1df0d94da7312d6c31e7e520db6c0186707b3cac
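The check described in the entry above can be illustrated with a minimal C++ sketch. This is not the GROMACS implementation: the names isEnvSet, chooseModularSimulator and chooseModularSimulatorFromEnvironment are hypothetical, an exception stands in for the release assert, and treating a variable as "set" whenever it exists in the environment is an assumption.

```cpp
#include <cstdlib>
#include <stdexcept>

// Hypothetical helper: true if the variable exists in the environment.
// Assumption: any value counts as "set"; the real parsing may differ.
static bool isEnvSet(const char* name)
{
    return std::getenv(name) != nullptr;
}

// Hypothetical decision function. The exception stands in for the release
// assert mentioned in the commit message.
bool chooseModularSimulator(bool forceOn, bool forceOff, bool defaultChoice)
{
    if (forceOn && forceOff)
    {
        throw std::invalid_argument(
                "GMX_USE_MODULAR_SIMULATOR and GMX_DISABLE_MODULAR_SIMULATOR "
                "cannot both be set");
    }
    if (forceOn)
    {
        return true;
    }
    if (forceOff)
    {
        return false;
    }
    // Neither variable is set: fall back to the caller's default.
    return defaultChoice;
}

// Reads both variables and applies the decision above.
bool chooseModularSimulatorFromEnvironment(bool defaultChoice)
{
    return chooseModularSimulator(isEnvSet("GMX_USE_MODULAR_SIMULATOR"),
                                  isEnvSet("GMX_DISABLE_MODULAR_SIMULATOR"),
                                  defaultChoice);
}
```

Keeping the decision separate from the environment lookup makes the contradictory case easy to exercise without touching the process environment.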
Mark Abraham [Tue, 3 Sep 2019 13:24:27 +0000 (15:24 +0200)]
Tests for valid periodic actions
We expect that mdrun propagation is unaffected by changing mdp options
that determine whether output is written. However, orchestrating mdrun
to collect and compute such data without affecting the propagation is
complex and currently very fragile. New propagation approaches must be
able to be tested.
Many mdrun combinations of periodic outputs and periodic action of
simulation modules that affect propagation are compared for
correctness against a simulation that did every action at every step.
These tests are fairly slow, so are in their own test binary and
annotated appropriately. They run by default only in release-type
builds. As they target testing the kind of coordination issues that
tend to appear in multi-rank runs, those runs are specifically
targeted.
The energy tolerances for the mdrun tests were far too tight. It seems
that tests passed anyhow because they compared runs under exactly
the same run conditions using (nearly) the same summation order.
This change slightly increases the tolerance for energies and
massively the tolerance for pressure comparison.
Change-Id: I88ea643873ebec0e5e2b12181f4b51ad90c7b0f7
M. Eric Irrgang [Wed, 30 Oct 2019 10:41:52 +0000 (13:41 +0300)]
Deduplicate gmxapi documentation.
`docs/gmxapi` seems to have settled down and does not have substantial
layout changes from the documentation in python_packaging/documentation.
Compartmentalized doc builds, e.g. with docs.dockerfile, still warrant
a separate simple conf.py, so that much is retained. Otherwise, it looks
like documentation for the standalone gmxapi package or development can
be maintained in a unified way with the GROMACS project documentation,
so this change removes the transitionally duplicated content.
* Remove most of python_packaging/documentation
* Update python_packaging/docker/docs.dockerfile
Refs #2698
Refs #2985
Change-Id: I0c23f6526894ec1eec4e910463c8c9d08a7315f6
Artem Zhmurov [Thu, 17 Oct 2019 13:36:09 +0000 (15:36 +0200)]
Remove excessive H2D and D2H copies of forces when update is offloaded
Forces are copied H2D only on steps when force buffer ops is not
offloaded to the GPU (e.g. steps when virials are computed).
The D2H copy is issued:
- after force buffer ops if the update is not offloaded or if the
forces are needed on CPU by separate PME rank force reduction or
for vsites force spreading.
- for the force output if update is offloaded and forces were not copied
yet.
Also added note regarding the need for the latter copy to ensure forces
are ready using the same mechanism as used for synchronizing update on
the availability of forces.
Change-Id: Ied74638d35e74a8970427df712501cd63e0aa0ab
Berk Hess [Wed, 30 Oct 2019 15:49:21 +0000 (16:49 +0100)]
Update GPU update restrictions
Ewald surface correction does not work and distance restraints do.
The graph can not be used, which is now asserted on.
Updated the user guide for these changes and missing restrictions.
Change-Id: Ica477462d55c047c8ae07284bbe20fba6f332f6b
Pascal Merz [Tue, 29 Oct 2019 22:55:48 +0000 (16:55 -0600)]
Fix doxygen for modular simulator
This fixes a few issues with the doxygen documentation for the
modular simulator:
* Add \file tag for all files
* Switching a few \internal to \libinternal in header files to keep
uniformity across modular simulator
* Making source file documentation \internal to avoid it appearing
outside full documentation
* Replace
//! \addtogroup module_modularsimulator
//! \{
...
//! \}
syntax with
\ingroup module_modularsimulator
except in files having one-line doxygen commands to avoid
bloating these
* Finally, solve an instance of doxygen complaining about a typedef
not being documented.
Change-Id: Ib74c57b580f70cf76091377fb8c49e1cc2bdfc2f
Jonathan Vincent [Wed, 2 Jan 2019 16:59:18 +0000 (08:59 -0800)]
Update PME CUDA spread/gather
Adds additional templated variants of the CUDA spread and gather
kernels, allowing the use of 4 threads per atom instead of 16 and
allowing the spline data to be recalculated in the spread instead of
saved to global memory and reloaded.
The combinations mean we have 4 different kernels that can be called
depending on which is most appropriate for the problem size and
hardware (to be decided heuristically). By default the existing method is
used (16 threads per atom, saving and reloading of spline data).
Added an additional option to disable the preloading of charges and
coordinates into shared memory, and instead each thread would
deal with a single atom.
Removed the (currently disabled) PME_GPU_PARALLEL_SPLINE=1 code
path.
Refs #2792 #3185 #3186 #3187 #3188
Change-Id: Ia48d8eb63e38d0d23eefd755dcc228ff9b66d3e6
Pascal Merz [Fri, 25 Oct 2019 01:44:10 +0000 (19:44 -0600)]
Fix modular simulator documentation
This fixes a few issues with the modular simulator documentation:
* VRescaleThermostat was not properly documented
* ParrinelloRahmanBarostat was not properly documented
* FreeEnergyPerturbationElement was missing in modularsimulator.md
* Some details in modularsimulator.md were not up-to-date
Change-Id: I2236d4c77c8a3ad700ea914600f9f8ceac885292
Berk Hess [Tue, 29 Oct 2019 14:15:26 +0000 (15:15 +0100)]
Fix walls with GPU update
Change-Id: I2e8568c9eab319065355c7a2596110bf1c625b31
Pascal Merz [Thu, 24 Oct 2019 00:17:41 +0000 (18:17 -0600)]
Fix compatibility check of modular simulator
Testing for additional functionality not implemented by modular
simulator during compatibility check:
* Pulling
* Acceleration
* Freeze
* Deform
Change-Id: Ib0523500e8c681631494053126a2ff80ce4f2697
M. Eric Irrgang [Wed, 16 Oct 2019 14:38:29 +0000 (17:38 +0300)]
Disable its tests when sample_restraint is not the master project.
Avoids some distracting CMake output for GROMACS builds, and avoids
trying to download a test input file.
Fixes #3111
Change-Id: I53f4d88184a5112a430a051c086ffa864b0eeb41
M. Eric Irrgang [Thu, 24 Oct 2019 15:25:05 +0000 (18:25 +0300)]
Decrease the maximum allowed Python version for gmxapi.
Update setup.py to require Python version <3.9. We should allow ourselves
a chance to review and respond to changes with the Python 3.9 release
before asserting support.
Changes with Python feature releases aren't likely to affect scientific
results, but could cost valuable computing resources. Historically,
gmxapi updates for feature releases have included Python base class
adjustments (responding to changes in the `typing` module with Python
3.7) and pybind11 header updates (responding to Python C API changes).
Resolves #3175
Change-Id: I7805964669ac74111eecf24402e651343f7908c0
Artem Zhmurov [Mon, 28 Oct 2019 15:29:13 +0000 (16:29 +0100)]
Disable GPU update when coordinates swapping is enabled
The coordinates swap is done on the CPU, which would require
additional D2H and H2D copies if the update is offloaded.
Change-Id: I27574fd6b9a3e653b44053dc8bf3f9a8a9501ed2
Berk Hess [Mon, 28 Oct 2019 14:46:20 +0000 (15:46 +0100)]
Add mdrun GPU update documentation
Add update for mdrun -update flag and a release note.
Also corrected some use of node to rank in the user guide.
Change-Id: I6233e4c0fa8c687a9d3cd4c38b372effb70d3e0c
Berk Hess [Tue, 29 Oct 2019 12:33:21 +0000 (13:33 +0100)]
Fix Ewald dipole correction with GPU update
Change-Id: I6b643abec3c1bc74b3e1017420cc03daecb0d4ef
Berk Hess [Mon, 28 Oct 2019 19:24:19 +0000 (20:24 +0100)]
Disable pressure coupling with GPU update
Refs #3182
Change-Id: I7752298a89f7069ae65e198af0c02dad153b4bf0
Paul Bauer [Tue, 29 Oct 2019 13:40:41 +0000 (14:40 +0100)]
Merge "Merge branch 'release-2019' into master"
Paul Bauer [Mon, 28 Oct 2019 15:37:37 +0000 (16:37 +0100)]
Merge branch 'release-2019' into master
Updated reference data that needed updating.
Resolved Conflicts:
cmake/gmxVersionInfo.cmake
docs/CMakeLists.txt
src/gromacs/ewald/pme-gpu-types-host.h
src/gromacs/ewald/pme-only.cpp
src/gromacs/gmxana/gmx_make_ndx.cpp
src/gromacs/mdlib/nbnxn_gpu_common.h
src/gromacs/mdrun/md.cpp
src/programs/mdrun/tests/minimize.cpp
Change-Id: Ic5aeae6bf932d190b95366373b64ee3449f4a630
Alan Gray [Tue, 29 Oct 2019 09:03:05 +0000 (02:03 -0700)]
Explicitly destroy PME-PP GPU communication object
Add code to destroy the object when it is no longer required. Even
though the object is managed by a unique pointer, this needs to be done
while the GPU context still exists, otherwise a seg fault can occur
when it is automatically destroyed later.
Addresses #3077
Change-Id: I9d6f798d79a73e2ce366c9fb85a0ff9339fc9f88
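The ordering issue in the entry above can be illustrated with a small C++ sketch. FakeGpuContext, FakeCommObject and runAndTearDown are hypothetical stand-ins, not GROMACS types; the sketch only shows that calling reset() on a unique pointer runs the destructor at a chosen point, here before the context is torn down.

```cpp
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for a GPU context; not a GROMACS class.
struct FakeGpuContext
{
    bool alive = true;
};

// Hypothetical stand-in for an object that owns GPU resources.
struct FakeCommObject
{
    FakeCommObject(FakeGpuContext* context, std::vector<std::string>* log) :
        context_(context), log_(log)
    {
    }
    ~FakeCommObject()
    {
        // A real object would free GPU resources here, which is only
        // valid while the context still exists.
        log_->push_back(context_->alive ? "released with live context"
                                        : "released after context teardown");
    }
    FakeGpuContext*           context_;
    std::vector<std::string>* log_;
};

void runAndTearDown(std::vector<std::string>* log)
{
    FakeGpuContext context;
    auto           comm = std::make_unique<FakeCommObject>(&context, log);

    // ... simulation work would happen here ...

    comm.reset();          // explicit destruction while the context is alive
    context.alive = false; // stand-in for tearing down the GPU context
}
```

Without the explicit reset(), the destructor would run at whatever later point the owning pointer goes out of scope, which in the scenario described may be after the context is gone.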
Artem Zhmurov [Thu, 24 Oct 2019 17:15:40 +0000 (19:15 +0200)]
Switch the GPU buffer ops on when update is on GPU
The update is supported on the GPU only when buffer ops are also
offloaded. This changes the behavior from requiring
GMX_USE_GPU_BUFFER_OPS to be enabled to overriding it.
Change-Id: Icdc154daa053f135b0df503697273016a830fb18
Artem Zhmurov [Tue, 22 Oct 2019 06:17:01 +0000 (08:17 +0200)]
Fix missing coordinate D2H copy with GPU update and CPU forces
When update is offloaded to the GPU, but not all forces are offloaded,
the coordinates need to be copied D2H at the beginning of every step.
This was not done for some cases, e.g. for CPU PME, which led to
wrong coordinates being used for force evaluation.
The D2H copy call is now split and corresponding calls are moved closer
to the consumers for clarity. The conditional on D2H copy for center of
mass motion removal is made more strict.
Bug was introduced in
a73c3ec2dd9dd64f0c728b7b1d90ac5bcfb246cc
Change-Id: Iebb184dc2e0b5fb68b4a627314d2373391c6ebf9
Artem Zhmurov [Tue, 22 Oct 2019 18:30:22 +0000 (20:30 +0200)]
Add haveCpuLocalForces flag to DomainLifetimeWorkload
The flag haveCpuLocalForces is moved to DomainLifetimeWorkload.
The name and meaning are also slightly changed: haveCpuLocalForceWork
now indicates whether any forces in the local domain are computed
on the CPU.
Consequently, the currently redundant PME f reduction conditional has
been removed from the haveCpuLocalForceWork initialization. The
relationship between local force work and existence of local forces on
the CPU however needs to be clarified in relationship to #3160 and how
the PP-PME direct communication is improved/simplified.
Change-Id: Idaed4da41bf5b72435c47bab8aeb7130b36b03c7
Szilárd Páll [Wed, 23 Oct 2019 17:17:46 +0000 (19:17 +0200)]
Extend SimulationWorkload with CPU flags
Added flags for PME and Nonbondeds to indicate whether there is CPU
workload; this is useful as the lack of GPU work does not imply the
existence of CPU work.
Also made createSimulationWorkload() take the PME runmode class enum
instead of a bunch of bools.
Made some naming consistency improvements.
Refs #3181
Change-Id: I66233f1c790fc5092fb1babaed2ec3ebf16416de
Artem Zhmurov [Fri, 18 Oct 2019 13:26:36 +0000 (15:26 +0200)]
Allow using update flag
When update flag was re-enabled, the assertion was not removed.
Change-Id: Idd1b3e0d9ff60209918281aea23588f90806b898
M. Eric Irrgang [Thu, 24 Oct 2019 15:12:56 +0000 (18:12 +0300)]
Update bundled pybind.
Update the pybind11 headers from the 2.4.3 tag. Addresses Python C API
updates from the Python 3.8.0 release.
Refs #3175
Change-Id: Ib1380b272aa061f475cd18fa78da5cef131a2998
M. Eric Irrgang [Sat, 26 Oct 2019 08:53:22 +0000 (11:53 +0300)]
Fix CMake message typo.
Verb need 's'.
Change-Id: Iec27a9fe8540c3a8db68f85769f9965937bb0924
Pascal Merz [Thu, 24 Oct 2019 20:04:23 +0000 (14:04 -0600)]
Enable forced rotation for modular simulator
Forced rotation was disabled (without being checked for)
with the modular simulator. There is no reason to do so, as the
enforced rotation does not require any new element or communication
between elements.
Change-Id: I057935cbaeaedfde751ca26ba11fb7c990b23efe
M. Eric Irrgang [Thu, 24 Oct 2019 13:57:22 +0000 (16:57 +0300)]
Improve gmxapi mdrun test.
* Tweak the pytest logging output to be more useful.
* Do not test for trajectory file output for MPI workers that had
no ensemble simulation members.
Resolves a false error from test_run_from_read_tpr_op when running
pytest under mpiexec.
Change-Id: Iea601570a51da325cbb5be1615f2fc738af4554c
Berk Hess [Thu, 24 Oct 2019 17:36:45 +0000 (19:36 +0200)]
Add bool useDomainDecomposition in mdrunner
Change-Id: I0a320a33aef2a118a188358f7df6ab0f00aa2e13
Berk Hess [Thu, 24 Oct 2019 17:21:31 +0000 (19:21 +0200)]
Add gmx_mtop_interaction_count()
Change-Id: Ia40990517cc5fa902100ba4d185b3d3211d312cd
Berk Hess [Thu, 24 Oct 2019 17:05:48 +0000 (19:05 +0200)]
Fix initializers in locality.h
Change-Id: I6b0e3066401e5e23a4923e4de87c2e4326d37297
Berk Hess [Wed, 9 Oct 2019 11:57:07 +0000 (13:57 +0200)]
Move locality.h from nbnxm to mdtypes
Removed duplicate definition of AtomLocality from
StatePropagatorDataGpu.
Change-Id: I79aa415dd6fc91791d0cc54dc07d7c56e9b7c874
Berk Hess [Wed, 23 Oct 2019 09:00:24 +0000 (11:00 +0200)]
Use gmx::Range in Nbnxm gridding functions
Change-Id: Ice818946dc78375065797056762acd340921ea70
M. Eric Irrgang [Wed, 16 Oct 2019 15:22:04 +0000 (18:22 +0300)]
Suppress some output when sample_restraint is built with GROMACS.
The install target for the sample_restraint plugin package is
disabled when built for testing as part of a GROMACS build, which made
a CMake status message confusing.
Change-Id: I235f879b50cf04092a8fc40a6c5e4722ffa108bf
ejjordan [Mon, 21 Oct 2019 09:11:47 +0000 (11:11 +0200)]
Make t_nextnb an implementation detail
This change moves the t_nextnb struct completely inside
of gen_excl and gen_pad. This makes changing the underlying
data structure easier.
Change-Id: I643025ff9e572849e7e1aaedab1484a1609fe7db
Artem Zhmurov [Tue, 22 Oct 2019 11:54:55 +0000 (13:54 +0200)]
Slight improvements to decideWhetherToUseGpuForUpdate(...)
1. The boolean is passed for vsites instead of entire mdatoms structure.
2. Arguments are taken as const.
3. Whether or not the position/orientation restraints are enabled is now
taken from topology.
Change-Id: I9d299b5e46c39af07a17af1a639907d1dd11a9bc
Berk Hess [Wed, 23 Oct 2019 09:50:02 +0000 (11:50 +0200)]
Fix illegal memory access in FE calculations
With free-energy calculations not using lambda states, an output
buffer would be accessed one element beyond its allocated size.
Note that this code should be completely refactored, but not
in a release branch.
Fixes #3173
Change-Id: I677e602ba96c9f64fbf79a626e43c9e590c18bea
Berk Hess [Wed, 23 Oct 2019 08:27:54 +0000 (10:27 +0200)]
Expand documentation for nbnxn_put_on_grid()
Change-Id: Id6349848d9d03215632cf23e59a556f4e40327fb
Pascal Merz [Tue, 22 Oct 2019 22:05:07 +0000 (16:05 -0600)]
Correct for skewed box before domain decomposition (modular simulator)
The correction for skewed boxes before domain decomposition is done
in the legacy code, but was missing in the modular simulator.
Change-Id: I202629c1b546247c64b09e491cfb09db402ca1b1
Alan Gray [Wed, 4 Sep 2019 12:41:21 +0000 (05:41 -0700)]
Enable GPU Peer Access in GPU Utilities
When using the new GPU communication features, enabling peer access
between pairs of GPUs (where supported) will allow peer-to-peer
communications. In this patch the CUDA code to enable peer access is
introduced into central GPU utilities and called from do_md.
Implements #3087
Change-Id: If668366b76d49f7b624eedb501f8af19135c4386
Pascal Merz [Wed, 23 Oct 2019 00:32:47 +0000 (18:32 -0600)]
Fix log writing (modular simulator)
The current implementation would not write energies to log if
- log was not in sync with energy writing, or
- log writing is more frequent than neighborsearching.
This commit fixes these bugs, restoring the legacy behavior. Moving
forward, it could be discussed whether writing to log file and writing
to energy trajectory could be coupled (i.e., should log writing frequency
be a multiple of energy writing?).
Fixes #3151
Change-Id: Icb12ecd7c9aedb29138a9a17fb6d130c4f23a06a
Pascal Merz [Tue, 22 Oct 2019 23:09:21 +0000 (17:09 -0600)]
Adapt enum to style convention (modular simulator)
This capitalizes the values of two enums within the modular simulator
framework (EnergySignallerEvent and TrajectoryEvent) in accordance
with the style conventions.
Change-Id: Iecd8e4ece43eee37ee7b3315ec65d86bb04f1feb
Pascal Merz [Tue, 22 Oct 2019 22:59:36 +0000 (16:59 -0600)]
Small improvement to EnergyElement (modular simulator)
This avoids registering energy and trajectory callbacks on
non-master ranks, as the result is never used. This also renames
isMaster_ to isMasterRank_ for consistency with other uses.
Change-Id: I364bbe73bf7d8ff3c6789d15e54603d3bd7f221b
Artem Zhmurov [Sun, 20 Oct 2019 13:30:30 +0000 (15:30 +0200)]
Remove unnecessary CUDA stream synchronization calls
These calls were needed before the synchronization was re-introduced.
With the event-based synchronization in place, they are no longer
necessary.
Change-Id: I9507432b40962cd49e4fc7374f15530c7fbf2ae7
Pascal Merz [Tue, 22 Oct 2019 17:40:26 +0000 (11:40 -0600)]
Fix stop condition handling (modular simulator)
Fixes a bug that would signal the stop condition repeatedly when
the second SIGINT was received, effectively keeping the simulation
from stopping by repeatedly delaying the last step to the following
step.
Change-Id: Ie0de7b893ff53ccb09e285cb1b84e01594f581d5
Pascal Merz [Tue, 22 Oct 2019 17:35:51 +0000 (11:35 -0600)]
Fix signal propagation (modular simulator)
In the modular simulator, signals between ranks were not reduced by
compute globals. This disabled the processing of SIGINT and might
have influenced checkpointing.
Change-Id: Idebe3c80f796c41cfdb12922c26c4920ca8d9d20
ejjordan [Tue, 22 Oct 2019 14:12:40 +0000 (16:12 +0200)]
Remove unused functions from gmxpreprocess
Change-Id: I1f635dcc2f534d5be0de30eaace32b561da3c13c
Artem Zhmurov [Tue, 22 Oct 2019 13:47:59 +0000 (15:47 +0200)]
Rename havePositionRestraints(...) to haveRestraints(...)
This function checks if there are position, distance or orientation
restraints in the system, so the havePositionRestraints(...) name
is misleading.
Change-Id: I9ffaa1ff68172c2227780492577ab1d8626a23f6
Berk Hess [Tue, 22 Oct 2019 09:33:20 +0000 (11:33 +0200)]
Require -ei for essential dynamics
Essential dynamics sampling was switched on by either the -ei
option or by ED information being present in the checkpoint file.
As this complicates initialization and we should not choose
algorithms based on a checkpoint, it is now only switched on
by the -ei option (but an mdp option is what we actually want).
Change-Id: I6e0873a38ee42779d09f66ed7e55c6d1b8bf25da
Christian Blau [Tue, 22 Oct 2019 13:44:15 +0000 (15:44 +0200)]
Fix nst in density-fitting module
An error in the nst variable led to forces from density-guided
simulations to be applied every step, because its internal step
counter would stay at 0.
Change-Id: Ib8b33b407bfe9c56bab497f7e2d505843675afca
refs: #2282
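The symptom in the entry above is consistent with a modulo-based step check, sketched here in C++ under that assumption. The function names are hypothetical and this is not the density-fitting module's code; it only shows why a step counter that stays at 0 makes every step look like an application step.

```cpp
#include <cstdint>

// Hypothetical check: apply the force only every nst-th step.
bool isForceApplicationStep(std::int64_t step, std::int64_t nst)
{
    return nst > 0 && step % nst == 0;
}

// Counts how many of the first numSteps steps would apply the force.
// If counterStuckAtZero is true, the check always sees step 0, which
// mimics the kind of bug described in the commit message.
std::int64_t countApplications(std::int64_t numSteps, std::int64_t nst, bool counterStuckAtZero)
{
    std::int64_t count = 0;
    for (std::int64_t step = 0; step < numSteps; ++step)
    {
        const std::int64_t seenStep = counterStuckAtZero ? 0 : step;
        if (isForceApplicationStep(seenStep, nst))
        {
            ++count;
        }
    }
    return count;
}
```

With nst = 10 over 100 steps, a working counter yields 10 applications, while a counter stuck at 0 yields 100.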
Pascal Merz [Mon, 21 Oct 2019 20:13:46 +0000 (14:13 -0600)]
Fix DomDec step (modular simulator)
Domain decomposition was mistakenly using the initial step for
the entire run with modular simulator.
Change-Id: I1a6a56c91e3e381bf3aa0e1de1d41b16f19a6634
Pascal Merz [Mon, 21 Oct 2019 21:25:58 +0000 (15:25 -0600)]
Fix PmeLoadBalanceHelper initialization (modular simulator)
PmeLoadBalanceHelper could be initialized before the state had a
valid box. Fixed the initialization order and added comments and
an assert to keep this error from creeping back in.
Change-Id: I7b59011d6e1d532543a1d1463eceacfcdc148557
ejjordan [Mon, 21 Oct 2019 16:08:07 +0000 (18:08 +0200)]
Remove unused function generate_excls and make clean_excls static
Change-Id: I9bd742da2032f0f7d9ce19bd71df6a87c881e6e1
Szilárd Páll [Mon, 21 Oct 2019 16:26:01 +0000 (18:26 +0200)]
Remove unsupported configs from the gpuupdate matrix
GPU update is not supported with domain decomposition and due to the
fallback paths being triggered these cases would anyway test GPU buffer
ops only which can already be tested with the gpubufferops matrix.
These configs can be re-enabled when the respective use-cases get
enabled in the future.
Change-Id: Ic65e118b05fa21683069fbffa19ad7285c27cdf4
Szilárd Páll [Tue, 15 Oct 2019 17:19:34 +0000 (19:19 +0200)]
Override GMX_GPU_DD_COMMS if nonbondeds are off
It is not useful to have environment variables force dev features
on when a hard dependency, here the nonbonded offload, is not enabled
in a run. This override should make it easier to pass tests.
Change-Id: I58898fc4ea55454d74bc0e7fb1121eab2c66d84c
Artem Zhmurov [Mon, 21 Oct 2019 09:42:14 +0000 (11:42 +0200)]
Update the copyright message
1. Add Artem Zhmurov to the list of contributors.
2. Update the copyright year.
Change-Id: Ic7d19acd58d123e8a5f1a91c3191fbdaa26db787
Artem Zhmurov [Fri, 18 Oct 2019 15:17:06 +0000 (17:17 +0200)]
Add environment variable that changes the meaning of '-update auto'
This change creates 'GMX_FORCE_UPDATE_DEFAULT_GPU', that changes the
default behavior of '-update' option to 'gpu'. Also changed the
gpuupdate Jenkins trigger to set this environment variable.
Refs. #3163.
Change-Id: I4463de3266d97c5f91bac65d3d997cf564e6e880
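How an environment variable can change the meaning of '-update auto', as in the entry above, is sketched below in C++. The enum and function names are hypothetical, and the sketch assumes that 'auto' otherwise resolves to the CPU and that the mere presence of the variable enables the override; the real resolution may involve further compatibility checks.

```cpp
#include <cstdlib>
#include <stdexcept>
#include <string>

enum class UpdateTarget
{
    Cpu,
    Gpu
};

// Hypothetical resolution of the '-update' option value.
// Assumption: without the override, 'auto' means CPU.
UpdateTarget resolveUpdateTarget(const std::string& option, bool forceGpuDefault)
{
    if (option == "gpu")
    {
        return UpdateTarget::Gpu;
    }
    if (option == "cpu")
    {
        return UpdateTarget::Cpu;
    }
    if (option == "auto")
    {
        return forceGpuDefault ? UpdateTarget::Gpu : UpdateTarget::Cpu;
    }
    throw std::invalid_argument("unknown value for -update: " + option);
}

// Reads the override from the environment.
// Assumption: the variable counts as enabled whenever it is present.
UpdateTarget resolveUpdateTargetFromEnvironment(const std::string& option)
{
    const bool forceGpuDefault = std::getenv("GMX_FORCE_UPDATE_DEFAULT_GPU") != nullptr;
    return resolveUpdateTarget(option, forceGpuDefault);
}
```

Explicit 'cpu' and 'gpu' requests are unaffected by the variable; only the 'auto' case changes.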
Alan Gray [Fri, 6 Sep 2019 07:00:12 +0000 (00:00 -0700)]
GPU Coordinate PME/PP Communications
Extends PmePpCommGpu class to provide PP-side support for coordinate
transfers from either GPU or CPU to PME task, and adds new
PmeCoordinateReceiverGpu class to receive coordinate data directly to
the GPU on the PME task.
Implements part of #2817
Refs TODOs #3157 #3158 #3159
Change-Id: Iefa2bdfd9813282ad8b07feeb7691f16880e61a2
M. Eric Irrgang [Wed, 2 Oct 2019 13:53:52 +0000 (16:53 +0300)]
Add trajectory output to mdrun operation.
Fixes #3144
Change-Id: Ic9e1eb988bc9c934eb8e9cd6009008e6e60d8342
Alan Gray [Sat, 31 Aug 2019 18:20:52 +0000 (11:20 -0700)]
GPU Receive for PME/PP GPU Force Communications
This change extends the PME/PP GPU force communication functionality
to allow the force buffer to be received directly into GPU memory on the
PP task.
Implements part of #2817
Refs #3158 #3159
Change-Id: I5b1cff1846c7c3bd966b6bf9c0af72769600ef18
Szilárd Páll [Tue, 15 Oct 2019 10:57:39 +0000 (12:57 +0200)]
Add strict assertions on x synchronizer in PME
To be able to add strict assertions on when a valid event synchronizer
needs to be passed to PME, a new boolean is stored in the PME GPU data
structures indicating whether execution is happening on a separate PME
rank.
Also clarified some function argument doxygen.
Change-Id: Ic141c70dded2b57f39b7f2e2bfa1a17d80604204
Christian Blau [Thu, 17 Oct 2019 11:36:13 +0000 (13:36 +0200)]
Manual entry and .mdp option for adaptive force scaling
Adds a manual entry and extends the mdp option section to describe the
adaptive force scaling parameters.
refs #2282
Change-Id: Ifb2e4af256282340f1111262f91e5d73d93c3208
Szilárd Páll [Fri, 18 Oct 2019 18:41:21 +0000 (20:41 +0200)]
Fix genion clang-8 warning
Commit
00c05bb switching to clang-tidy 8 was not based on the latest
master and went in without flagging a recently introduced warning.
This fixes the warning, which also fixes master presubmit.
Change-Id: If193a376f6f0c79bb004ef5775d404d71c5b6f10
Christian Blau [Wed, 16 Oct 2019 14:45:39 +0000 (16:45 +0200)]
genion: prohibit ion placement close to non-solvent
Changes behaviour of genion so that "-rmin" does not only account for
distances between ions but everything that is not in the selected
solvent group.
Avoids unwanted replacement of crystal water ions in proteins and
very close placement of ions to proteins.
Change-Id: Ieb82352fac044ceae2f0d55504dcdf40305b9a61
Paul Bauer [Thu, 17 Oct 2019 15:05:54 +0000 (17:05 +0200)]
Use clang-tidy-8
Change-Id: I208eddb4903d45547c39e9dcb9b45a30753e6958
Artem Zhmurov [Wed, 16 Oct 2019 14:33:31 +0000 (16:33 +0200)]
Remove excessive H2D and D2H coordinates copies when update is offloaded
The H2D copies are only needed:
1. When update is not offloaded.
2. At the search steps, after device buffers were reinitialized.
The D2H copies are only needed:
1. On the search steps, since the device buffers are reinitialized.
2. If there are CPU consumers, e.g. CPU bondeds.
3. When the energy is computed.
4. When coordinates are needed for output.
There are two special cases, when coordinates are needed on host,
that are dealt with separately:
1. When PME is tuned.
2. When center of mass motion is removed.
The locality of copied atoms when update is offloaded is changed
from All to Local in preparation for multi-GPU case. The blocking sync
on H2D copy event is moved from UpdateConstraints to
StatePropagatorDataGpu.
Change-Id: I971a6273b39fa7da07600312c085ce343b5d25ee
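The copy conditions listed in the entry above can be written as two predicates. This C++ sketch uses hypothetical names (StepFlags, needCoordinatesH2D, needCoordinatesD2H), leaves out the two special cases (PME tuning and center of mass motion removal), and assumes that no D2H copy is needed when the update runs on the CPU.

```cpp
// Hypothetical per-step flags; not the GROMACS workload structures.
struct StepFlags
{
    bool updateOnGpu;      // update is offloaded to the GPU
    bool searchStep;       // device buffers are reinitialized on this step
    bool cpuConsumers;     // e.g. bonded forces computed on the CPU
    bool computeEnergy;    // energy is computed on this step
    bool writeCoordinates; // coordinates are needed for output
};

// H2D copy: needed when update is not offloaded, or on search steps.
bool needCoordinatesH2D(const StepFlags& s)
{
    return !s.updateOnGpu || s.searchStep;
}

// D2H copy: with offloaded update, needed on search steps, for CPU
// consumers, for energy computation and for output.
// Assumption: with CPU update the host copy is already current.
bool needCoordinatesD2H(const StepFlags& s)
{
    return s.updateOnGpu
           && (s.searchStep || s.cpuConsumers || s.computeEnergy || s.writeCoordinates);
}
```

On an ordinary step with everything offloaded and no output, both predicates are false, which is the case where the copies are avoided.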
Berk Hess [Fri, 18 Oct 2019 05:39:38 +0000 (07:39 +0200)]
Update grompp cutoff-scheme warnings
As the group cutoff scheme has been removed, it is confusing for
the user to complain about the cutoff-scheme option not being set.
Also updated warning messages that said that things don't work with
the Verlet cut-off scheme.
Change-Id: Ie73a6cde90ac963c82ed15e51a19169e7eb948f0
Christian Blau [Thu, 17 Oct 2019 12:41:42 +0000 (14:41 +0200)]
Basic refactoring genion and adding test
Adding a test and refactoring genion so that it passes memory checks.
Needed as a base for changed functionality.
Change-Id: Ie6408bb031cb40f6d74faa17a316a81fb8661289
Artem Zhmurov [Wed, 16 Oct 2019 11:33:18 +0000 (13:33 +0200)]
Remove excessive H2D and D2H copies of velocities when update is offloaded
If update is offloaded:
The H2D copy of velocities is done:
1. At the search step after the device buffers were reinitialized.
The D2H copy is done:
1. In the beginning of the step on search steps (to back up before the
device buffers are reinitialized).
2. In the beginning of the velocity output step.
3. After update when globals are computed.
4. After update when temperature is needed for the next step.
The Local locality is used for the copies when update is offloaded in
anticipation of the multi GPU case.
The REMD simulations are now not supported when update is offloaded.
Change-Id: Ifbb9636cafba8980a4a781d942420c5c2c1bcdfd
Christian Blau [Tue, 24 Sep 2019 14:32:28 +0000 (16:32 +0200)]
Adaptive force scaling for densityfitting
Scales the force adaptively for density-guided-simulations
refs #2282
Change-Id: I96310f498cf2fae9f7385a9396f1253b760d135e
Szilárd Páll [Thu, 17 Oct 2019 14:20:29 +0000 (16:20 +0200)]
Fix clang-tidy warnings in gpuhaloexchange_impl.cuh
Change-Id: Ifdbe7809c1f37808ed29d89253d1fb786c0c5ae7
Szilárd Páll [Tue, 15 Oct 2019 22:44:59 +0000 (00:44 +0200)]
Eliminate spurious GPU->CPU copy
When GPU direct communication-based halo exchange is used, non-local
forces should not be copied back to the CPU prior to halo exchange.
This commit removes the incorrectly unconditional leftover copy.
Fixes/Improves #2890
Refs #3156
Change-Id: I25e521204f30da1e257232e9117c3fe4f0a83b08
Szilárd Páll [Tue, 15 Oct 2019 00:47:21 +0000 (02:47 +0200)]
Disable GMX_GPU_PME_PP_COMMS if PME is not offloaded
Change-Id: I54ba9015fd95a1c74bdcb23257a6d2b40dd4872d
Christian Blau [Tue, 24 Sep 2019 14:33:13 +0000 (16:33 +0200)]
Exponential moving average
Evaluate the exponential moving average of a quantity.
Change-Id: I80747f7a76366ba64b2c4e1bd149834644d7709b
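For reference, a textbook exponential moving average can be sketched in C++ as below. This is a generic formulation, not the class added by the entry above: the class name, the smoothing-factor parameter and the choice to seed the average with the first sample are all assumptions.

```cpp
#include <stdexcept>

// Generic exponential moving average:
//   average = alpha * sample + (1 - alpha) * previousAverage
// Assumption: the first sample initializes the average directly.
class ExponentialMovingAverage
{
public:
    explicit ExponentialMovingAverage(double smoothingFactor) : alpha_(smoothingFactor)
    {
        if (!(smoothingFactor > 0.0 && smoothingFactor <= 1.0))
        {
            throw std::invalid_argument("smoothing factor must be in (0, 1]");
        }
    }

    // Folds a new sample into the average and returns the updated value.
    double update(double sample)
    {
        if (!initialized_)
        {
            average_     = sample;
            initialized_ = true;
        }
        else
        {
            average_ = alpha_ * sample + (1.0 - alpha_) * average_;
        }
        return average_;
    }

    double value() const { return average_; }

private:
    double alpha_;
    double average_     = 0.0;
    bool   initialized_ = false;
};
```

A larger smoothing factor makes the average follow recent samples more closely; a smaller one gives a longer memory.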
Szilárd Páll [Tue, 15 Oct 2019 23:30:12 +0000 (01:30 +0200)]
Fix initialization of field in PME data structure
The bPPRank field that indicated whether the rank had PP duty as well
was not always initialized leading to a spurious trigger in a recently
added assertion.
Change-Id: Id9700c4318e01781da0a0c63323025f2525e58b6
M. Eric Irrgang [Thu, 12 Sep 2019 07:49:47 +0000 (10:49 +0300)]
Minor tidying to mdrun.py
Rearrange and annotate some code in gmxapi.simulation.mdrun prior to
functional additions.
Change-Id: If9efefce8f75681fbaa1c615d96e355a5650eedc
M. Eric Irrgang [Thu, 26 Sep 2019 12:17:49 +0000 (15:17 +0300)]
Minor clean-up to gmxapi package.
* Resolve some linting warnings and potential import ambiguities.
* Apply recommended pybind iteration mechanism for Python objects.
* Remove some extraneous syntax and imports.
* Make explicit a call that relied on positional argument processing.
* Alias an imported name (Future) to avoid namespace collision.
* Normalize initialization of DataSourceCollection.
* Make sure that initialization at construction uses the same setter
code path as subscripted assignment on the object instances.
* Add ResultDescription to gmxapi.operation.Future.
Change-Id: Ia687929302edd85a0af616b1d947db21e2f3876e
Artem Zhmurov [Mon, 14 Oct 2019 20:58:08 +0000 (22:58 +0200)]
Link GPU force producer and consumer tasks
The GPU event synchronizer that indicates that forces are ready
for consumption is now passed to the GPU update-constraints.
The update-constraints enqueue a wait on the event in the update
stream before performing numerical integration and constraining.
Note that the event is conditionally returned by the
StatePropagatorDataGpu and indicates that either the reduction of
forces on the GPU or the H2D copy is done, depending on the offload
scenario on the current timestep.
Refs. #2816, #2888, #3126.
Change-Id: Ic12b0c55b75ec5f0c31ce500a2760fb4d5cf3b91
M. Eric Irrgang [Sat, 12 Oct 2019 13:12:48 +0000 (16:12 +0300)]
Update gmxapi.version module.
* Improve documentation about `has_feature` optional exception behavior.
* Normalize to FeatureNotAvailableError.
* Describe transitions of named features into the version specification.
Refs #3130
Change-Id: Iea6ba1b9cc3cf1b1ef6de71cd8ae5d9c593146c3
Szilárd Páll [Fri, 27 Sep 2019 17:21:01 +0000 (19:21 +0200)]
Make the wait on nonbonded GPU results conditional
When the force reduction is done on the GPU and there are no energy or
shift force results required, there is no need to block and wait on the
CPU until the GPU nonbonded kernels complete.
This change makes the wait conditional on whether there are nonbonded
force, energy or shift force outputs so the blocking wait is now skipped
with GPU buffer ops on force-only steps.
Also removed the now unnecessary boolean argument passed to
gpu_launch_cpyback().
Refs #3128
Change-Id: Ic1285f5a00ac910cd1d6c4358f41f2c7c41dea4c
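The condition described in the entry above reduces to a simple predicate, sketched here in C++. The struct and function names are hypothetical; the sketch only encodes the statement that the CPU needs to block on the nonbonded GPU task when force, energy or shift force outputs are required on the host.

```cpp
// Hypothetical description of which nonbonded outputs the CPU needs
// on a given step; not the GROMACS step workload structure.
struct NonbondedOutputs
{
    bool forces;      // forces are needed on the CPU (no GPU buffer ops)
    bool energies;    // energies are computed on this step
    bool shiftForces; // shift forces are needed, e.g. for the virial
};

// The blocking wait is needed only if at least one output is required.
bool mustWaitForNonbondedGpu(const NonbondedOutputs& outputs)
{
    return outputs.forces || outputs.energies || outputs.shiftForces;
}
```

On a force-only step with GPU buffer ops, all three flags are false and the wait can be skipped.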
Szilárd Páll [Mon, 14 Oct 2019 17:19:03 +0000 (19:19 +0200)]
Trigger synchronizer when local forces are ready
The synchronizer is created and managed in StatePropagatorDataGpu and is
passed to the nonbonded module at the f buffer ops init.
Refs #2888 #3126
Change-Id: Ie9bf0b6cd8511fe282e377e48f3940e591db214c
Alan Gray [Sun, 25 Aug 2019 19:42:47 +0000 (12:42 -0700)]
PME/PP GPU Pimpl Class and GPU->CPU Communication for force buffer
Activate with GMX_GPU_PME_PP_COMMS env variable
Implements new pimpl class for PME-PP GPU communications. Performs
scatter of force buffer data from PME task GPU buffer to PP task CPU
buffers directly using CUDA memory copies. Requires thread MPI to be
in use.
Implements part of #2891
Change-Id: I0181ff67065c75f20cddc361f695df9bf888cd88
Artem Zhmurov [Mon, 14 Oct 2019 21:36:20 +0000 (23:36 +0200)]
Fix the single-GPU update-constraints
This is a temporary fix to make it work. Better solutions are in other patches.
1. The getter for the update stream returned the stream itself instead of a
pointer to it.
2. The copy stream for forces with AtomLocality:All set to updateStream.
Change-Id: I02b15beddebc160f2fe4fc21da64975977855699
M. Eric Irrgang [Mon, 14 Oct 2019 09:27:30 +0000 (12:27 +0300)]
Fix logic error in StaticSourceManager initialization.
`dict` inputs are Iterable, but are not the sort of sequence type that
we were trying to catch as ambiguous (in terms of data shape).
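A short illustration of the distinction, using only the standard library. The
helper `looks_like_ambiguous_sequence` is hypothetical and is not the check
used in StaticSourceManager:

```python
from collections.abc import Iterable, Mapping, Sequence

# A dict passes an Iterable check but is not a Sequence.
assert isinstance({"a": 1}, Iterable)
assert not isinstance({"a": 1}, Sequence)


def looks_like_ambiguous_sequence(value):
    """Hypothetical helper: True for list-like inputs, False for
    strings, bytes, and mappings such as dict."""
    if isinstance(value, (str, bytes, Mapping)):
        return False
    return isinstance(value, Iterable)
```

Testing for `Iterable` alone would therefore also catch `dict` inputs, which
is the kind of false positive described above.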
Change-Id: I71b79c7389197a0750d21874bf6f7cb3fef7721b
Szilárd Páll [Thu, 10 Oct 2019 16:10:38 +0000 (18:10 +0200)]
Link GPU coordinate producer and consumer tasks
The event synchronizer indicating that coordinates are ready in the GPU
is now passed to the two tasks that depend on this input: PME and
X buffer ops. Both enqueue a wait on the passed event prior to kernel
launch to ensure that the coordinates are ready before the kernels
start executing.
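The producer/consumer dependency can be modeled on the host with a
`threading.Event`. This is an analogy only: in the GPU code the wait is
enqueued in a stream rather than blocking a CPU thread, and all names below
are hypothetical:

```python
import threading

coordinates_ready = threading.Event()  # stands in for the event synchronizer
coordinates = []
log = []
log_lock = threading.Lock()


def copy_coordinates():
    """Producer: make the coordinates available, then mark the event."""
    coordinates.extend([0.0, 1.0, 2.0])
    coordinates_ready.set()


def consumer(name):
    """Consumer: wait on the event before touching the coordinates."""
    coordinates_ready.wait()
    with log_lock:
        log.append((name, len(coordinates)))


# Start the two consumers first to show that they wait for the producer.
threads = [threading.Thread(target=consumer, args=(name,))
           for name in ("pme", "x_buffer_ops")]
for thread in threads:
    thread.start()
producer = threading.Thread(target=copy_coordinates)
producer.start()
for thread in threads + [producer]:
    thread.join()
```

Both consumers see the complete coordinate buffer regardless of start order,
because each waits on the same event that the producer marks.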
On the separate PME ranks and in tests, as we use a single stream,
no synchronization is necessary.
With the on-device sync in place, this change also removes the
streamSynchronize call from copyCoordinatesToGpu.
Refs. #2816, #3126.
Change-Id: I3457f01f44ca6d6ad08e0118d8b1def2ab0b381b
M. Eric Irrgang [Thu, 26 Sep 2019 12:35:26 +0000 (15:35 +0300)]
Add note about libpython requirement to gmxapi install docs.
Change-Id: I0f6a3187412a6257b1b981c91e29954fbec46120
M. Eric Irrgang [Thu, 26 Sep 2019 12:34:20 +0000 (15:34 +0300)]
Add generic check for gmxapi support library to Python package build.
Change-Id: I5a84407665d4f66b97f02338126a620a0c88ffb4
Berk Hess [Mon, 14 Oct 2019 16:30:43 +0000 (18:30 +0200)]
Const correctness for genhydro
Change-Id: I59f018a817046202209c37905e5cfb6dd359073b
Paul Bauer [Mon, 14 Oct 2019 10:00:06 +0000 (12:00 +0200)]
Fix bug in preprocessing
Use of wrong vector caused out of bounds access.
Fixes #3129
Change-Id: I913c850a638f80d9d8d24708dc30170f448bc36c
Artem Zhmurov [Thu, 10 Oct 2019 16:48:30 +0000 (18:48 +0200)]
Add separate constructor to StatePropagatorDataGpu for PME-only rank / PME tests
A separate constructor is added to the StatePropagatorDataGpu to use in the
separate PME rank and in PME tests. These use the provided stream to copy
coordinates for atoms with Local or All localities. Copy of coordinates for
non-local particles as well as copy operations for the forces and velocities
are not allowed by assertions.
Refs. #3126.
Change-Id: I66aeeaea54931398b1a4a30b920b092f7d40ae16
M. Eric Irrgang [Sun, 13 Oct 2019 12:11:08 +0000 (15:11 +0300)]
Fix logic error in gmxapi.version.api_is_at_least().
Change-Id: I15707471a350ba908d0725e54a097bfa288fcdb7
Mark Abraham [Sun, 13 Oct 2019 12:58:58 +0000 (14:58 +0200)]
Ensure environment variable use does not give warnings
The ARM GCC 5 builds give false positive warnings that we should be
using the values returned from getenv.
Change-Id: I64734098d6a030124d75051905f0c96dc497c00b
Szilárd Páll [Thu, 10 Oct 2019 20:20:40 +0000 (22:20 +0200)]
Enable StatePropagatorDataGpu for force transfers
Force transfers were already switched to use StatePropagatorDataGpu
earlier. This change updates the synchronization mechanisms as follows:
- replaces the previous stream sync after GPU buffer/ops reduction with
a waitForcesReadyOnHost call;
- removes the barriers in copyForces[From|To]Gpu() as dependencies
are now satisfied: most dependencies are intra-stream and therefore
implicit, the exception being the halo exchange that uses its own
mechanism to sync H2D in the local stream with the nonlocal stream
(which is yet to be replaced, refs #3093).
Refs. #3126.
Change-Id: I8bfd39f79c87f20492c4ae287d6f19261724f806
M. Eric Irrgang [Wed, 17 Jul 2019 11:43:26 +0000 (14:43 +0300)]
Update sample_restraint documentation entry points.
TODO:
* After beta 2, we can add some new permalinks for web-based
* there are a few other "to do"s noted in the updated README files.
Change-Id: Ib875cef05ab9351237a6da99c4bd1ae8d32994d5
Szilárd Páll [Wed, 9 Oct 2019 00:53:24 +0000 (02:53 +0200)]
Move buffer ops / PME F reduction flags into StepWorkload
Also moved the override conditions for when buffer ops cannot be offloaded
into the DevelopmentFeatureFlags data structure initialization, which
had to be shifted so that this code can be passed the
task assignment decision on nonbonded offload.
Change-Id: Ib6850bcf306a70bbd9557cf2d5c2b1e39159e566
M. Eric Irrgang [Thu, 10 Oct 2019 10:28:47 +0000 (13:28 +0300)]
Test float and int gmxapi parameter updates.
Also updates the spc_water_box test fixture resource to specify an
exactly representable time step.
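A quick way to check whether a decimal time step is exactly representable as
a binary floating-point number, assuming double precision. The values below
are illustrative and are not taken from the test fixture:

```python
from fractions import Fraction


def is_exactly_representable(decimal_text):
    """True if the decimal string converts to a Python float (IEEE 754
    double) without rounding."""
    return Fraction(float(decimal_text)) == Fraction(decimal_text)


# 0.002 has no finite binary expansion, so it is rounded when stored.
assert not is_exactly_representable("0.002")
# 0.001953125 is 2**-9 and is stored exactly.
assert is_exactly_representable("0.001953125")
```

With an exactly representable step, parameter values compared in a test do
not carry rounding error from the decimal-to-binary conversion.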
Change-Id: I5691be2041cd769585b98638ba4b29473752ba1d
M. Eric Irrgang [Wed, 11 Sep 2019 07:49:26 +0000 (10:49 +0300)]
Usability updates for gmxapi/docs Docker image.
* Allow specifying a tagged version of gmxapi/ci-mpich to build on.
Updates like Change-Id: I9f67d46aa93fb789825cb78cc2a95b24f7dfccaa
cause incompatibilities between docs.dockerfile and its base image.
* Update recommended docker command line. The `-t` flag is not
necessary, and may allow spurious terminal control signals to confuse
the Apache web server, causing it to exit with a message like
`caught SIGWINCH, shutting down gracefully`
Change-Id: I27e7af82bf9ab252e3b30a803d4df36187579d4b
Artem Zhmurov [Thu, 10 Oct 2019 16:03:11 +0000 (18:03 +0200)]
Centralize management of forces ready on GPU event
This change adds the GpuEventSynchronizer for the forces-reduced-on-GPU event
to the StatePropagatorDataGpu. If the buffer ops are offloaded, this event
should be marked when the force reduction is done. The consumers of the forces
on the GPU will get either this event or the event marking that the H2D copy
is done, depending on the current step workload and offload scenario.
Change-Id: Ib559dbed5ad777eac3a906e4ee0ebaa07caf0ac1
Artem Zhmurov [Thu, 10 Oct 2019 17:28:50 +0000 (19:28 +0200)]
Change copy mode to Async in PME only rank
The H2D copy on the PME-only rank was set to synchronous mode by
mistake. This patch corrects that.
Change-Id: I621b03e574b23d2933b4940a20d234e37b57408d