M. Eric Irrgang [Thu, 12 Sep 2019 07:49:47 +0000 (10:49 +0300)]
Minor tidying to mdrun.py
Rearrange and annotate some code in gmxapi.simulation.mdrun prior to
functional additions.
Change-Id: If9efefce8f75681fbaa1c615d96e355a5650eedc
M. Eric Irrgang [Thu, 26 Sep 2019 12:17:49 +0000 (15:17 +0300)]
Minor clean-up to gmxapi package.
* Resolve some linting warnings and potential import ambiguities.
* Apply recommended pybind iteration mechanism for Python objects.
* Remove some extraneous syntax and imports.
* Make explicit a call that relied on positional argument processing.
* Alias an imported name (Future) to avoid namespace collision.
* Normalize initialization of DataSourceCollection.
* Make sure that initialization at construction uses the same setter
code path as subscripted assignment on the object instances.
* Add ResultDescription to gmxapi.operation.Future.
Change-Id: Ia687929302edd85a0af616b1d947db21e2f3876e
Artem Zhmurov [Mon, 14 Oct 2019 20:58:08 +0000 (22:58 +0200)]
Link GPU force producer and consumer tasks
The GPU event synchronizer that indicates that forces are ready
for consumption is now passed to the GPU update-constraints.
The update-constraints enqueues a wait on the event in the update
stream before performing numerical integration and constraining.
Note that the event is conditionally returned by the
StatePropagatorDataGpu and indicates that either the reduction of
forces on the GPU or the H2D copy is done, depending on the offload
scenario on the current timestep.
Refs. #2816, #2888, #3126.
Change-Id: Ic12b0c55b75ec5f0c31ce500a2760fb4d5cf3b91
M. Eric Irrgang [Sat, 12 Oct 2019 13:12:48 +0000 (16:12 +0300)]
Update gmxapi.version module.
* Improve documentation about `has_feature` optional exception behavior.
* Normalize to FeatureNotAvailableError.
* Describe transitions of named features into the version specification.
Refs #3130
Change-Id: Iea6ba1b9cc3cf1b1ef6de71cd8ae5d9c593146c3
Szilárd Páll [Fri, 27 Sep 2019 17:21:01 +0000 (19:21 +0200)]
Make the wait on nonbonded GPU results conditional
When the force reduction is done on the GPU and there are no energy or
shift force results required, there is no need to block and wait on the
CPU until the GPU nonbonded kernels complete.
This change makes the wait conditional on whether there are nonbonded
force, energy or shift force outputs so the blocking wait is now skipped
with GPU buffer ops on force-only steps.
Also removed the now unnecessary boolean argument passed to
gpu_launch_cpyback().
Refs #3128
Change-Id: Ic1285f5a00ac910cd1d6c4358f41f2c7c41dea4c
Szilárd Páll [Mon, 14 Oct 2019 17:19:03 +0000 (19:19 +0200)]
Trigger synchronizer when local forces are ready
The synchronizer is created and managed in StatePropagatorDataGpu and is
passed to the nonbonded module at the f buffer ops init.
Refs #2888 #3126
Change-Id: Ie9bf0b6cd8511fe282e377e48f3940e591db214c
Alan Gray [Sun, 25 Aug 2019 19:42:47 +0000 (12:42 -0700)]
PME/PP GPU Pimpl Class and GPU->CPU Communication for force buffer
Activate with GMX_GPU_PME_PP_COMMS env variable
Implements new pimpl class for PME-PP GPU communications. Performs
scatter of force buffer data from PME task GPU buffer to PP task CPU
buffers directly using CUDA memory copies. Requires thread MPI to be
in use.
Implements part of #2891
Change-Id: I0181ff67065c75f20cddc361f695df9bf888cd88
Artem Zhmurov [Mon, 14 Oct 2019 21:36:20 +0000 (23:36 +0200)]
Fix the single-GPU update-constraints
This is a temporary fix to make it work. Better solutions are in other patches.
1. The getter for the update stream returned the stream itself instead of a
pointer to it.
2. The copy stream for forces with AtomLocality:All set to updateStream.
Change-Id: I02b15beddebc160f2fe4fc21da64975977855699
M. Eric Irrgang [Mon, 14 Oct 2019 09:27:30 +0000 (12:27 +0300)]
Fix logic error in StaticSourceManager initialization.
`dict` inputs are Iterable, but are not the sort of sequence type that
we were trying to catch as ambiguous (in terms of data shape).
Change-Id: I71b79c7389197a0750d21874bf6f7cb3fef7721b
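The distinction drawn in the StaticSourceManager fix above can be illustrated with a minimal sketch. The helper name `is_ambiguous_sequence` is hypothetical and is not taken from the gmxapi sources; it only shows why a bare `Iterable` check is too broad when `dict` inputs must be treated as a single value.

```python
from collections.abc import Iterable, Mapping


def is_ambiguous_sequence(value) -> bool:
    """Hypothetical helper: True for list-like inputs whose data shape is ambiguous."""
    # Strings, bytes and mappings are all Iterable, but each is treated here
    # as a single value rather than as a sequence of values.
    if isinstance(value, (str, bytes, Mapping)):
        return False
    return isinstance(value, Iterable)


# A dict passes the plain Iterable check, which is what made that check too broad.
dict_is_iterable = isinstance({"a": 1}, Iterable)
```

Under these assumptions a list or tuple is flagged as ambiguous, while a `dict` is not, even though `dict_is_iterable` is true.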
Szilárd Páll [Thu, 10 Oct 2019 16:10:38 +0000 (18:10 +0200)]
Link GPU coordinate producer and consumer tasks
The event synchronizer indicating that coordinates are ready in the GPU
is now passed to the two tasks that depend on this input: PME and
X buffer ops. Both enqueue a wait on the passed event prior to kernel
launch to ensure that the coordinates are ready before the kernels
start executing.
On the separate PME ranks and in tests, as we use a single stream,
no synchronization is necessary.
With the on-device sync in place, this change also removes the
streamSynchronize call from copyCoordinatesToGpu.
Refs. #2816, #3126.
Change-Id: I3457f01f44ca6d6ad08e0118d8b1def2ab0b381b
M. Eric Irrgang [Thu, 26 Sep 2019 12:35:26 +0000 (15:35 +0300)]
Add note about libpython requirement to gmxapi install docs.
Change-Id: I0f6a3187412a6257b1b981c91e29954fbec46120
M. Eric Irrgang [Thu, 26 Sep 2019 12:34:20 +0000 (15:34 +0300)]
Add generic check for gmxapi support library to Python package build.
Change-Id: I5a84407665d4f66b97f02338126a620a0c88ffb4
Berk Hess [Mon, 14 Oct 2019 16:30:43 +0000 (18:30 +0200)]
Const correctness for genhydro
Change-Id: I59f018a817046202209c37905e5cfb6dd359073b
Paul Bauer [Mon, 14 Oct 2019 10:00:06 +0000 (12:00 +0200)]
Fix bug in preprocessing
Use of wrong vector caused out of bounds access.
Fixes #3129
Change-Id: I913c850a638f80d9d8d24708dc30170f448bc36c
Artem Zhmurov [Thu, 10 Oct 2019 16:48:30 +0000 (18:48 +0200)]
Add separate constructor to StatePropagatorDataGpu for PME-only rank / PME tests
A separate constructor is added to the StatePropagatorDataGpu to use in the
separate PME rank and in PME tests. These use the provided stream to copy
coordinates for atoms with Local or All localities. Copy of coordinates for
non-local particles as well as copy operations for the forces and velocities
are not allowed by assertions.
Refs. #3126.
Change-Id: I66aeeaea54931398b1a4a30b920b092f7d40ae16
M. Eric Irrgang [Sun, 13 Oct 2019 12:11:08 +0000 (15:11 +0300)]
Fix logic error in gmxapi.version.api_is_at_least().
Change-Id: I15707471a350ba908d0725e54a097bfa288fcdb7
Mark Abraham [Sun, 13 Oct 2019 12:58:58 +0000 (14:58 +0200)]
Ensure environment variable use does not give warnings
The ARM GCC 5 builds give false positive warnings that we should be
using the values returned from getenv.
Change-Id: I64734098d6a030124d75051905f0c96dc497c00b
Szilárd Páll [Thu, 10 Oct 2019 20:20:40 +0000 (22:20 +0200)]
Enable StatePropagatorDataGpu for force transfers
Force transfers were already switched to use StatePropagatorDataGpu
earlier. This change updates the synchronization mechanisms as follows:
- replaces the previous stream sync after GPU buffer/ops reduction with
a waitForcesReadyOnHost call;
- removes the barriers in copyForces[From|To]Gpu() as dependencies
are now satisfied: most dependencies are intra-stream and therefore
implicit, the exception being the halo exchange that uses its own
mechanism to sync H2D in the local stream with the nonlocal stream
(which is yet to be replaced; refs #3093).
Refs. #3126.
Change-Id: I8bfd39f79c87f20492c4ae287d6f19261724f806
M. Eric Irrgang [Wed, 17 Jul 2019 11:43:26 +0000 (14:43 +0300)]
Update sample_restraint documentation entry points.
TODO:
* After beta 2, we can add some new permalinks for web-based
* there are a few other "to do"s noted in the updated README files.
Change-Id: Ib875cef05ab9351237a6da99c4bd1ae8d32994d5
Szilárd Páll [Wed, 9 Oct 2019 00:53:24 +0000 (02:53 +0200)]
Move buffer ops / PME F reduction flags into StepWorkload
Also moved the override conditions for when buffer ops cannot be offloaded
into the DevelopmentFeatureFlags data structure initialization, the
initialization of which had to be shifted so this code can be passed the
task assignment decision on nonbonded offload.
Change-Id: Ib6850bcf306a70bbd9557cf2d5c2b1e39159e566
M. Eric Irrgang [Thu, 10 Oct 2019 10:28:47 +0000 (13:28 +0300)]
Test float and int gmxapi parameter updates.
Also updates the spc_water_box test fixture resource to specify an
exactly representable time step.
Change-Id: I5691be2041cd769585b98638ba4b29473752ba1d
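One plausible motivation for an exactly representable time step is that a test can then compare a parameter read back from the simulation input with the value that was set, without a tolerance. The sketch below assumes the value may pass through single precision at some point. The helper name and the sample values are illustrative and are not taken from the test fixture.

```python
import struct


def survives_float32(value: float) -> bool:
    """True if value is unchanged by a round trip through IEEE 754 single precision."""
    return struct.unpack("f", struct.pack("f", value))[0] == value


# 0.002 has no finite binary expansion, so narrowing to single precision changes it.
typical_step_is_exact = survives_float32(0.002)
# 2**-9 = 0.001953125 is a power of two and is stored exactly at either precision.
power_of_two_step_is_exact = survives_float32(0.001953125)
```

Here `typical_step_is_exact` is false and `power_of_two_step_is_exact` is true, which is why a power-of-two step permits equality checks.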
M. Eric Irrgang [Wed, 11 Sep 2019 07:49:26 +0000 (10:49 +0300)]
Usability updates for gmxapi/docs Docker image.
* Allow specifying a tagged version of gmxapi/ci-mpich to build on.
Updates like Change-Id: I9f67d46aa93fb789825cb78cc2a95b24f7dfccaa
cause incompatibilities between docs.dockerfile and its base image.
* Update recommended docker command line. The `-t` flag is not
necessary, and may allow spurious terminal control signals to confuse
the Apache web server, causing it to exit with a message like
`caught SIGWINCH, shutting down gracefully`
Change-Id: I27e7af82bf9ab252e3b30a803d4df36187579d4b
Artem Zhmurov [Thu, 10 Oct 2019 16:03:11 +0000 (18:03 +0200)]
Centralize management of forces ready on GPU event
This change adds the GpuEventSynchronizer for the forces reduced on GPU event
to the StatePropagatorDataGpu. This event should be marked if the buffer ops
are offloaded when the force reduction is done. The consumers of the forces
on the GPU will get either this event or the event marking that the H2D copy is done,
depending on the current step workload and offload scenario.
Change-Id: Ib559dbed5ad777eac3a906e4ee0ebaa07caf0ac1
Artem Zhmurov [Thu, 10 Oct 2019 17:28:50 +0000 (19:28 +0200)]
Change copy mode to Async in PME only rank
The H2D copy in PME-only rank was set to synchronous mode by
mistake. This patch corrects that.
Change-Id: I621b03e574b23d2933b4940a20d234e37b57408d
M. Eric Irrgang [Wed, 25 Sep 2019 10:04:39 +0000 (13:04 +0300)]
Improve parameter typing for gmxapi simulation parameter setter.
* Use conventional simple built-in types for simulation parameter
setter.
* Resolve a sphinx warning due to unrecognized documented parameter
type.
* Make sure we test both integer and floating point parameter setting.
Change-Id: I3c51bbeaf8a5145690e8000b159099476ff53be6
Berk Hess [Wed, 9 Oct 2019 18:07:50 +0000 (20:07 +0200)]
Add class FixedCapacityVector
This satisfies the need for a std::vector-like container
without the overhead of dynamic allocation.
Change-Id: Ic12f958a42e736ccd1d0a69d2a9de6056a4b7ff5
Artem Zhmurov [Wed, 9 Oct 2019 14:02:36 +0000 (16:02 +0200)]
Centralize management of coordinates ready on GPU event
This change shifts the ownership of the EventSynchronizer
indicating that coordinates are available on the device from
UpdateConstraints to StatePropagatorDataGpu.
The latter now provides a getter that, based on workload flags, decides
which synchronizer to return. If update is offloaded the returned event
records the completion of the update-constraints on the GPU.
When update is not offloaded and on search steps the returned event
records the completion of the host to device coordinate transfer.
Change-Id: Idec9f46e78a2708fa31e895bf7590cdad1872987
Artem Zhmurov [Fri, 11 Oct 2019 11:48:35 +0000 (13:48 +0200)]
Wrong locality was used when ignoring the PME force contribution
The PME forces have no non-local contribution, hence they should be
ignored when non-local forces are reduced. This fixes the bug with
local PME forces ignored instead of non-local. The bug was
introduced in
9409fa5f.
Change-Id: Ie27c71d1f51ca4022f196a9fa114af42685fcbcf
Paul Bauer [Tue, 17 Sep 2019 15:33:13 +0000 (17:33 +0200)]
Add helper to reuse generated TPR files in testing
Used static class members in GoogleTest and provided option in
testfilemanager to allow file path specification before test case is
started.
This should speed up some of the test cases that have been slow due to
repeated calls to grompp.
Change-Id: I50e29d04550d78f2324e3665e903d45515464298
Szilárd Páll [Wed, 9 Oct 2019 00:32:11 +0000 (02:32 +0200)]
Migrate code to using SimulationWorkload flags
This changes boolean flags to use the ones defined after task assignment
in SimulationWorkload.
Also does some minor cleanup around code touched in do_force().
No functionality should change.
Change-Id: Ia9bd28f84fee4cb98b578b366153646b2959c1f4
Szilárd Páll [Thu, 10 Oct 2019 14:17:58 +0000 (16:17 +0200)]
Remove incorrect GPU task dependency
The dependency between nonlocal reduction and PME forces is not necessary
and therefore it is removed.
Change-Id: Ic7eba307b519f0fb8bde24a2cf676a22d64065ba
Szilárd Páll [Thu, 10 Oct 2019 19:56:16 +0000 (21:56 +0200)]
Fix conditions for building StatePropagatorDataGpu
StatePropagatorDataGpu was only built when GPU update was active rather
than when GPU buffer ops is active.
Change-Id: I6d2efa73c4896a293155c0853a32b8cedf3d23a6
Szilárd Páll [Wed, 9 Oct 2019 18:18:44 +0000 (20:18 +0200)]
Construct a struct with development feature flags
Added a struct that collects development feature flags populated using
the same environment variables that were used previously.
With this the change also eliminates repeated getenv queries.
Change-Id: I2cb4f4dfc28bb6370bb4f7cb549f2222e7580e47
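The pattern described above (query the environment once, then pass a struct of booleans around) can be sketched as follows. This is a Python illustration of the idea, not the C++ code from the change. The field names and `GMX_EXAMPLE_FEATURE` are hypothetical; `GMX_GPU_PME_PP_COMMS` is the variable named elsewhere in this log. The sketch assumes a flag counts as enabled whenever its variable is set.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class DevelopmentFeatureFlags:
    """Hypothetical field names; one boolean per development feature."""
    enable_gpu_pme_pp_comm: bool
    enable_example_feature: bool


def read_development_feature_flags(environ=os.environ) -> DevelopmentFeatureFlags:
    """Query the environment once and return an immutable set of flags."""
    return DevelopmentFeatureFlags(
        # Assumption: a variable that is set (to any value) enables the feature.
        enable_gpu_pme_pp_comm="GMX_GPU_PME_PP_COMMS" in environ,
        # GMX_EXAMPLE_FEATURE is a made-up name used only for illustration.
        enable_example_feature="GMX_EXAMPLE_FEATURE" in environ,
    )


# Later code consults the struct instead of repeating the environment lookup.
flags = read_development_feature_flags({"GMX_GPU_PME_PP_COMMS": "1"})
```

Passing the mapping as an argument keeps the lookup in one place and makes the flags easy to set in tests.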
Szilárd Páll [Tue, 8 Oct 2019 01:47:46 +0000 (03:47 +0200)]
Pass list of EventSynchronizers to GPU reduction
Instead of passing events one-by-one or doing implicit synchronization
hidden in a force task's internals, a list of synchronizer objects that
signal the availability of forces from the source tasks is assembled and
passed to the GPU buffer ops/reduction.
The new dependency added here is the H2D transfer for the CPU-side force
contribution. The PME-related event is kept; the halo exchange dependency
still needs to be added.
Change-Id: I5ca5ef919b49508790e813e0469aaeef4f6484c0
Paul Bauer [Mon, 7 Oct 2019 09:36:04 +0000 (11:36 +0200)]
Basic population of simulation workload
Adds basic population of simulation workload datastructure from
information available after task assignment is done to GPU(s).
Also changes population of step and domain work datastructures to
return new datastructure instead of passing pointer to routine that
fills them.
Change-Id: Ie48a4f74eaa69092f24355e19f66a90d6700baa0
Artem Zhmurov [Fri, 4 Oct 2019 15:48:34 +0000 (17:48 +0200)]
Add management for velocities and forces copy events to StatePropagatorDataGpu
All H2D and D2H copies of velocities and forces now record an event, methods
to synchronize on those events are added to the class.
Change-Id: I910c5834d83f317f12c1fe0cd71ced168f412386
Szilárd Páll [Mon, 30 Sep 2019 13:04:10 +0000 (15:04 +0200)]
Add management for coordinates copy events into StatePropagatorDataGpu
The coordinate copies are now assigned a GPU stream and fire an event
when done. The consumers can now wait on coordinates to be ready on
Host or get the GPU event to enqueue a wait on Device.
Change-Id: Ia33e366f32d777ec980940ff7e284ab0b3498637
Szilárd Páll [Mon, 30 Sep 2019 12:47:11 +0000 (14:47 +0200)]
Add GPU event to mark that coordinate update is done
Also added related getter of a pointer to the GpuEventSynchronizer
object that allows syncing the dependent tasks.
Added temporary CPU blocking wait that ensures previous behaviour is
maintained while the GPU update integration is pending.
Change-Id: Iaae03d4f1f4083863a359e66a35f44d178b9382d
Berk Hess [Mon, 7 Oct 2019 09:59:56 +0000 (11:59 +0200)]
Make nbnxm a proper module
Added definition of module_nbnxm to doxygen documentation.
Required removing the path from many #includes inside
the module.
Change-Id: I1d46aefaa83b0be674c14a4ef5ee16573eee0e1a
Berk Hess [Tue, 8 Oct 2019 08:29:35 +0000 (10:29 +0200)]
Simplify ewald_LRcorrection() call signature
Also removed an outdated comment.
Change-Id: I09af7185f3475c70eab154974f0dba0653a6142a
M. Eric Irrgang [Sun, 6 Oct 2019 14:41:45 +0000 (17:41 +0300)]
Update handling of CMake policy 68.
The policy is now available in supported CMake versions, so apply it
universally.
Change-Id: Ia6ae999f25d7d9b3932fe74862825a8fb8527352
Artem Zhmurov [Tue, 8 Oct 2019 05:24:51 +0000 (07:24 +0200)]
Fix for the recent nightly failure
The const qualifier in the function declaration produced a warning
that caused the nightly to fail. This fixes the issue, introduced
by
77857c5.
Change-Id: Icef063107a5f66d841f1d110dc9f9fb807364ed2
Prashanth Kanduri [Mon, 7 Oct 2019 15:19:13 +0000 (17:19 +0200)]
Fused two if branches with the same conditional
Change-Id: I8a1fc90875982f2723d9b2f2ea2d847a33f5866b
Artem Zhmurov [Fri, 27 Sep 2019 20:29:30 +0000 (22:29 +0200)]
Pass the GPU streams to StatePropagatorDataGpu constructor
Now the StatePropagatorDataGpu has a local copy of all GPU streams and
manages the update stream. This will allow selecting a specific stream
for a specific copy event in the follow-ups. The update stream is now
created in the constructor of the StatePropagatorDataGpu object, which
is a temporary solution until there is a separate device stream manager
(#3115).
Notes:
- The current implementation where StatePropagatorDataGpu is also used
on PME-only ranks, where many of the streams do not exist, without
any restriction on the methods which would require these streams is a
weakness of the design that will be dealt with in a follow-up.
- The OpenCL builds unconditionally use PME stream/context, since for
these this object is only used when the initial coordinates are copied.
- The update stream is created in the constructor, whereas the rest of
the streams are passed as arguments. This asymmetry will be removed
with introduction of the centralized management of context/streams.
Refs. #2816.
Change-Id: Ia9b1cabd1d3d4942dba8465c716bf644037581e7
Mark Abraham [Fri, 30 Aug 2019 16:27:05 +0000 (18:27 +0200)]
Extend GPU traits class
Now GPU traits provide a non-GPU header, so that generic code
can use CommandStream, CommandEvent and DeviceContext types.
The header also diverges to a platform-specific version when
needed upon compilation. This change allows for passing the
variables of the above types in the general (non-GPU) parts
of the code and can be included where the code is shared
between different platforms.
Renamed a Context variable to DeviceContext for greater clarity.
Change-Id: If21b9dacac66ff7203948eb03de96f9473b7359a
Artem Zhmurov [Mon, 7 Oct 2019 08:47:34 +0000 (10:47 +0200)]
Clean up the comments sections in Update-Constrain
Remove todos that are done, update the comments related to
the recent changes in the logic.
Change-Id: Ib26848b22efae77d6db0a2ab85734ceaf02db235
M. Eric Irrgang [Sun, 6 Oct 2019 14:39:15 +0000 (17:39 +0300)]
Update rpath comments in src/api/cpp
Change-Id: I7839ee47ec8af7fe9445894f357b10b2ac803338
Artem Zhmurov [Wed, 2 Oct 2019 16:15:32 +0000 (18:15 +0200)]
Fix a bug in GPU halo exchange initialization
The non-local stream was passed to a constructor as a local stream.
Change-Id: I4e5b2405b0beb8e64720b7fc6be72ff15e1847fb
Mark Abraham [Sat, 28 Sep 2019 08:25:41 +0000 (10:25 +0200)]
Fix that tests intended for multiple ranks run that way
Added the flag that triggers ctest to use the flags that will produce
multiple ranks in both thread-MPI and real MPI cases.
Change-Id: I4ef3af4fd1750ab7cc231a29191fc4042385c309
Berk Hess [Fri, 4 Oct 2019 07:43:19 +0000 (09:43 +0200)]
Fix random energy and virial with PME on GPU on a PME-only rank
Missing zero initialization of never computed LJ PME energy and virial
terms with PME on the GPU could lead to random energy, virial and
pressure numbers.
The effect of this bug was that the potential and total energy could
be off (not the Coulomb mesh energy). This didn't affect sampling.
The pressure could be off, which would affect sampling when pressure
coupling is used, but likely the simulation would explode after
a few steps.
Fixes #3120
Change-Id: I309dde958f1b73e7f71f87f4f5ad016d16f8d16b
Artem Zhmurov [Wed, 2 Oct 2019 16:00:54 +0000 (18:00 +0200)]
Fix the bug with offloading the PME-only ranks to GPU
This fixes the bug introduced with StatePropagatorDataGpu (Commit
092a8f68):
In PME-only ranks, the GPU context and stream were mixed up and passed to
the constructor of StatePropagatorDataGpu in the wrong order. This caused
failure in OpenCL builds with error -34 (CL_INVALID_CONTEXT). Did not
affect the CUDA builds, since the context is not relevant in CUDA.
Change-Id: I5e070e361c7bdef3168887dcac4869bb71c6c5ed
Berk Hess [Wed, 2 Oct 2019 09:28:06 +0000 (11:28 +0200)]
Fix GPU atom data init timer issue
The GPU atom data init timer was read conditionally on the timing
of the local pairlist transfer. But the local pairlist transfer
is not timed with an empty list, leading to an inconsistent timer
state.
Change-Id: Ifc2a63c7273ae65ae66708c6a8b0fb526041ee38
Berk Hess [Wed, 25 Sep 2019 15:15:42 +0000 (17:15 +0200)]
Use gmx::Range for iZones in pair search
Change-Id: Ifed1ac3ed2fc0b02680a33d4e44620f82248dda9
M. Eric Irrgang [Wed, 11 Sep 2019 08:56:52 +0000 (11:56 +0300)]
Build gmxapi documentation in main GROMACS documentation.
* Update gmxapi Python package installation instructions and remove
quickstart.rst
* Minor updates to main installation guide document.
Refs #2698
Refs #2985
Change-Id: I6f1d6e6fe59e618144e4f14c0d2fe9f9b8c2c901
Berk Hess [Tue, 1 Oct 2019 22:17:42 +0000 (00:17 +0200)]
Fix setting maxNumColumns in Nbnxm::GridSet
The use of uninitialized data likely only led to suboptimal GPU usage.
Change-Id: Ib63c3f039da85dd219226cc6abe9f83cfd97e116
Carsten Kutzner [Wed, 2 Oct 2019 14:09:45 +0000 (16:09 +0200)]
Update the www link to the Doxygen documentation in the dev guide
The doxygen manual seems to have been relocated from
http://www.stack.nl/~dimitri/doxygen/manual/index.html
to http://www.doxygen.nl/manual/
Change-Id: I321072065576c52a104a119abb5e05dc90eece3a
Artem Zhmurov [Fri, 20 Sep 2019 08:34:25 +0000 (10:34 +0200)]
Eliminate D2D copy in update constraints
The intermediate coordinates (x' or xp) are only needed inside
the update-constraints module (for the constraints algorithms)
and never used outside. Hence, the xp variable can be used to
save the coordinates before update, while x stores the final
coordinates. This way, there is no need to make a D2D xp->x
copy after applying the constraints, since x will have the
correct data.
Refs. #2888, #3114.
Change-Id: I363b633976a236a8e2bf2137c21d3bf0a765cb06
Berk Hess [Wed, 2 Oct 2019 12:20:25 +0000 (14:20 +0200)]
Fix harmless OpenMP write race
Change-Id: I77e014ec2005e5289a0bd13ec608c73641928b54
Paul Bauer [Wed, 2 Oct 2019 13:57:41 +0000 (15:57 +0200)]
Fix failures in clang-8 nightly build
Change-Id: Icbf1bf47f4c526f1f04b9ba2e6ec3de54e06eeba
Mark Abraham [Fri, 20 Sep 2019 15:46:06 +0000 (17:46 +0200)]
Improve PME testing
The change to PME coordinate management uglified these tests even
further. Creation semantics are now more clear.
Change-Id: I309a487cdd218d85dbe4620e1d94e2f04f809a8d
Berk Hess [Wed, 25 Sep 2019 09:42:29 +0000 (11:42 +0200)]
Use gmx::Range in domdec
Renamed gmx_domdec_ns_ranges_t to DDPairInteractionRanges
and made it use gmx::Range. This is now used in an std::vector
in gmx_domdec_zones_t.
Note: more refactoring should be done in gmx_domdec_zones_t.
Change-Id: Iff61096ee8ce1e45a998a905fd56e0c2f43fa319
Artem Zhmurov [Tue, 3 Sep 2019 12:23:40 +0000 (14:23 +0200)]
StatePropagatorDataGpu object to manage GPU forces, positions and velocities buffers
In the current version the positions and forces on the GPU are managed by different
modules, depending on the offload scenario for a particular run. This makes
management of the buffers complicated and fragile. This commit adds the object
responsible for management of the GPU buffers of coordinates, forces and
velocities. The object is connected to all clients that use coordinates, forces
and velocities buffers, while keeping the existing logic intact where
possible.
Since the H2D and D2H copies are now done in the nullptr stream, some of the implicit
synchronization is lost. Consequently this commit does not always work
properly with newly introduced buffer ops / halo exchange features. To avoid
confusion, GPU buffer ops are disabled by an assertion. There will be
a separate commit with all copies done synchronously, which will work
with the buffer ops. The stream- and event-based synchronization will be
introduced in the follow-up commits.
Refs. #2816.
Change-Id: I2e2ba1b6436f087d1f2fef4ff876445814a724e7
Mark Abraham [Mon, 30 Sep 2019 18:06:20 +0000 (20:06 +0200)]
Move GPU task assignment
This commit changes no functionality except that it does re-order some
log-file reporting, with GPU task assignment reporting now preceding
OpenMP and thread affinity reporting.
Now that there is multi-stage DD construction, the GPU task assignment
can be done between the stages. This will permit the construction of
GPU resources in time to build domain objects with GPU stream(s),
where appropriate.
Change-Id: I1734afa65ee5b54995ccf23a69ff25ec6587a019
Mark Abraham [Tue, 1 Oct 2019 13:54:06 +0000 (15:54 +0200)]
Fix icc warnings
This makes the intent clearer to see
Change-Id: I633d3921d125edf4cb529554b04fd8667a76843a
Berk Hess [Thu, 19 Sep 2019 14:59:26 +0000 (16:59 +0200)]
Completely remove charge groups
Charge groups remain in the .top and .tpr formats for backward
compatibility.
Change-Id: I9410d464dae07e0dac5bddfcc0e551c821963547
Berk Hess [Mon, 30 Sep 2019 20:13:27 +0000 (22:13 +0200)]
Simplify DD exclusion counting
Change-Id: I7004b6860fc0c21fc4d82378b4a86d29a94d9391
Christoph Junghans [Mon, 30 Sep 2019 23:46:46 +0000 (17:46 -0600)]
cmake: add missing header to legacy api
range.h is now needed by block.h
Change-Id: I372ddbe6475511d3e6e711e2f677792d568ac853
Pascal Merz [Tue, 1 Oct 2019 04:28:08 +0000 (22:28 -0600)]
Output correct kinetic at last step (modular simulator)
This replicates the fix in I9a0bc228 for the modular simulator.
Fixes a bug which would write an incorrect kinetic energy and
temperature to the energy and log files when using the leap-frog
integrator and having a last step not coinciding with an energy
calculation step.
Refs #2950
Change-Id: I9969fa8114240145d9aa84f381a898bbb7570b75
Mark Abraham [Tue, 21 May 2019 14:45:17 +0000 (16:45 +0200)]
Removed dependency on commrec of mdrun setup
Changes no functionality.
Setup is now parameterized directly on MPI_COMM_WORLD, which we will
want later for letting library-based callers pass in an
MPI_Communicator. This permits commrec to be initialized later, once
the threads have been spawned for the thread-MPI ranks.
The initialization of multi-simulations moves from LegacyMdrunOptions
to SimulationContext, which is more appropriate for
ensemble-parallelism established directly by the user.
Before the decision about the duty of a rank, there is no difference
between MASTER(cr) and SIMMASTER(cr), so several calls to macros
taking a t_commrec pointer are replaced by booleans. Introduced
findIsSimulationMasterRank to compute that value. This eliminates
early use of t_commrec that has necessitated other hacks and
workarounds.
Removed redundant check for replica exchange when the number of multi
simulations is less than two, because gmx_multisim_t constructor
already prohibits that.
Resolves several TODO items and improves modularity, too.
Refs #2587, #2605, #3081
Change-Id: I48bd3b713bc181b5c1e4cbcd648706a9f00eab96
Carsten Kutzner [Mon, 30 Sep 2019 12:28:29 +0000 (14:28 +0200)]
Move ForceProviders class into the gmx namespace
Since the C group kernels have been removed there is no need any more for
having ForceProviders as a struct outside the gmx namespace.
Change-Id: I2c91fbd06be289e3eb8acd74436d931cf3e994de
Mark Abraham [Mon, 30 Sep 2019 08:36:35 +0000 (10:36 +0200)]
Make a builder for DD
This creates the seam for task assignment to use the rank duty
assignment to choose GPU devices and construct resources like streams
before completing building the DD object with such resources in place.
Used a pimpl class to keep the DD setup implementation details
private.
Change-Id: I6dbfe0122777bb529f8007386ff06b227e355263
Szilárd Páll [Mon, 30 Sep 2019 10:54:52 +0000 (12:54 +0200)]
Move PME GPU program building later
Moves the code closer to where it is needed and later during
initialization so even without a high-level manager, context between
nonbonded and PME could be shared.
Change-Id: I47d9645dc0ee5574960964dfd3f79479ff21bcea
Berk Hess [Thu, 19 Sep 2019 14:42:07 +0000 (16:42 +0200)]
Remove charge group code from forcerec
Removed several charge group members from t_forcerec.
Removed the cginfo intra-cg exclusion flags, as it is always true
for single atoms.
Removed loops over charge groups from init_cginfo_mb().
Change-Id: Ic91ea78dc341dc9972567a488294cb3d262ef9f6
Carsten Kutzner [Mon, 30 Sep 2019 12:56:18 +0000 (14:56 +0200)]
Corrected a few typos in documentation and comments
Change-Id: Ic697080a7c1f1a01bb8370563c1256990935d05e
Szilárd Páll [Tue, 10 Sep 2019 15:40:01 +0000 (17:40 +0200)]
Make the wait on PME GPU results conditional
When the PME forces are reduced on-GPU and no energy/virial output is
produced, we can avoid blocking waiting on the CPU for the PME GPU
task to complete.
This however would break the timing accounting, which needs to happen
after the PME tasks have completed. Hence the accounting is moved to the PME
output clearing.
Refs #3029, #2817
Change-Id: I4e7f3aa43754a187fe5d6b584803444967516958
Berk Hess [Sun, 29 Sep 2019 20:21:15 +0000 (22:21 +0200)]
Increase ti3p5 minimization test tolerance
This is needed because of larger differences when running
this test on multiple ranks.
Change-Id: I3056ae5933800e30e274791e8d27e2b8e79f6e04
Berk Hess [Sat, 28 Sep 2019 21:22:26 +0000 (23:22 +0200)]
Fix CG minimization with DD
A force for determining the CG direction was not communicated
correctly with domain decomposition due to recent refactoring change
5ed943bc.
Fixes #3112
Change-Id: Ic33062f4aad0ccd7c1296c7351bc71d735ebba24
Artem Zhmurov [Fri, 27 Sep 2019 11:58:40 +0000 (13:58 +0200)]
Slight improvements to GPU update/constraints initialization
1. Assertions are updated to better correspond to the supported
conditions.
2. The message in the log now depends on whether or not there are
constraints in the system.
3. Redundant set(...)/setPbc(...) before the MD loop removed.
The getter for the total number of constraints is also added to
the Constraints class.
TODO: The assertions can be removed once the supported
functionality is expanded.
Change-Id: I96e2b993f79cf721f7aa48b9af0eab6500c593ba
Szilárd Páll [Fri, 27 Sep 2019 13:11:32 +0000 (15:11 +0200)]
Switch GPU update off by default
The code-path is untested and not fit for being on by default, so better
kept off until it becomes stable and better tested.
Change-Id: Ia651b0ae2f0f6751c13ea6857cdf6d176b83e34e
Artem Zhmurov [Thu, 26 Sep 2019 00:01:31 +0000 (02:01 +0200)]
Introduce stream management to GPU Update-Constraints
The GPU stream in the GPU versions of Leap Frog, LINCS and SETTLE
is now managed by the UpdateConstraints module and passed to the
members. This prepares Update-Constraints to be switched to
non-nullptr stream.
Change-Id: I95b9e4112874eb02ea646975760922ab31ce6b27
Berk Hess [Wed, 25 Sep 2019 09:19:31 +0000 (11:19 +0200)]
Extract gmx::Range
Added a range class for general use by extracting
gmx::RangePartitioning::Block to gmx::Range in range.h and renaming
and adding some methods.
Change-Id: Id9384c139d4daf1a31240e28333c9e9801c2dffc
Berk Hess [Fri, 27 Sep 2019 13:51:09 +0000 (15:51 +0200)]
Fix location of angle in vsite figure
The virtual site figure in the manual had an angle circle
in the wrong location (after shifting part of figure).
Fixes #3109
Change-Id: Ieb19485f231cff3e15bfd4158d64b82f7b69fac5
Mark Abraham [Fri, 27 Sep 2019 07:38:38 +0000 (09:38 +0200)]
Improve MiMiC test implementation
These tests were not using callGrompp in the intended way, and
inadvertently relying on behavior that might change when follow-up
work on #3081 changes mdrun thread-MPI initialization behavior
Change-Id: Icdbb6049a7b609ddaabc8e073f87e8bbb00da4f2
Mark Abraham [Wed, 25 Sep 2019 17:30:36 +0000 (19:30 +0200)]
Stop checking CUDA version reported by nvcc
We should trust FindCUDA.cmake to do its job (and not find headers
that don't match nvcc), and also prepare to use the native CUDA
support in CMake (where nvcc isn't used).
Removed unused GMX_CUDA_VERSION config variable
Change-Id: I8959c16c7e1247dc114bae818e0b867cf6b8ae71
Mark Abraham [Thu, 26 Sep 2019 17:54:20 +0000 (19:54 +0200)]
Fix gro file output with index groups
Fixes #3107
Change-Id: I3fc586c69066a354b3210d9616125ef666f1ce26
Mark Abraham [Thu, 26 Sep 2019 17:50:00 +0000 (19:50 +0200)]
Add tests for editconf file conversion with indexing
The indexed output to gro files illustrates bug #3107
Fixed two memory leaks necessary to let the tests pass
Refs #3107
Change-Id: Ie61430210ac8a804fb86783c8b76c41333b8d3cf
Mark Abraham [Tue, 13 Aug 2019 12:31:57 +0000 (14:31 +0200)]
Remove leftover support for pre-9.0 CUDA
Refs #2831
Change-Id: I7ec33bb3582006123e745d06da27c9eed12fbfc2
Berk Hess [Thu, 26 Sep 2019 21:26:33 +0000 (23:26 +0200)]
Made mdrun appending work again
Fix that mdrun would only append the log file and create new files
for all other file types.
Fixes #3108
Change-Id: I0fe517f4ce3f6630e589a65d106ca67bcb14765b
Paul Bauer [Wed, 25 Sep 2019 15:42:38 +0000 (17:42 +0200)]
Revert "Remove update flag again"
This reverts commit
85ef063c53f77a3423b2943d16d1d2f3759aa1bf.
Reason for revert: We want to have the user interface available again and tested.
Change-Id: I976531f37a9aaf2782b22f22653804ad0f303b95
Szilárd Páll [Wed, 25 Sep 2019 14:05:49 +0000 (16:05 +0200)]
Merge remote-tracking branch 'origin/release-2019'
Change-Id: Ia822fecb18b63d8a4e4408e056850a42875d57e8
Paul Bauer [Mon, 9 Sep 2019 09:04:37 +0000 (11:04 +0200)]
GROMACS 2020 first beta release
Updated regressiontest hash.
TODO branch 2020 from master and set correct regressiontest branch.
Change-Id: Ibe58dfbec933aa6166a7573d2d371614dbacb52f
Mark Abraham [Fri, 20 Sep 2019 16:55:32 +0000 (18:55 +0200)]
Fix log file flag output
Fixes #3099
Change-Id: I5049e8807b87c83959a0a2a5b3439b904d5edc8b
Paul Bauer [Tue, 24 Sep 2019 10:25:36 +0000 (12:25 +0200)]
Add test for noappend
New test covers behaviour where users always set the -noappend flag.
Also adds documentation for checkpoint header.
Change-Id: I012bb6d20eb297bb654cbddfe0c2530161fc902b
Mark Abraham [Tue, 24 Sep 2019 14:09:27 +0000 (16:09 +0200)]
Fix no-MPI CUDA build
Change-Id: I32407bf373acca24dd7250503ab78387fa6b8404
Mark Abraham [Tue, 24 Sep 2019 16:03:08 +0000 (18:03 +0200)]
Merge "Merge branch release-2019 into master"
Magnus Lundborg [Tue, 24 Sep 2019 09:43:44 +0000 (11:43 +0200)]
Add .part0001. to files from the first run with -noappend.
There was a bug where only the runs after the first got
.partxxxx. added when running with -noappend.
Fixes #3103
Change-Id: I36fddfca058216a50f9910c548b8a106bf460f7b
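The naming behaviour described above can be sketched as follows. The helper `add_part_suffix` is hypothetical and is not the mdrun implementation; the four-digit zero padding is inferred from the `.part0001.` form quoted in the message.

```python
import os.path


def add_part_suffix(filename: str, part: int) -> str:
    """Hypothetical helper: insert '.partNNNN' before the file extension."""
    root, ext = os.path.splitext(filename)
    return f"{root}.part{part:04d}{ext}"


# With the fix, the first run is numbered too, not only the continuations.
first_run_log = add_part_suffix("md.log", 1)
second_run_log = add_part_suffix("md.log", 2)
```

Under these assumptions the first run writes `md.part0001.log` and the second writes `md.part0002.log`, so the file names of every run follow the same pattern.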
Paul Bauer [Mon, 23 Sep 2019 07:44:00 +0000 (09:44 +0200)]
Merge branch release-2019 into master
Resolved Conflicts:
cmake/gmxDetectCpu.cmake
src/gromacs/gmxpreprocess/topshake.cpp
src/gromacs/listed-forces/bonded.cpp
src/gromacs/mdlib/nbnxn_cuda/nbnxn_cuda_data_mgmt.cu
src/gromacs/mdlib/nbnxn_gpu.h
src/gromacs/mdlib/nbnxn_ocl/nbnxn_ocl.cpp
src/gromacs/mdlib/nbnxn_ocl/nbnxn_ocl_data_mgmt.cpp
This does not include the change to the MPI_COMPILE flags handling,
as the code in master has been reorganized in a way that is incompatible
with that.
Change-Id: Idb1da69ce68f3121442859e8cbe24495c7894726
Paul Bauer [Fri, 14 Jun 2019 14:50:22 +0000 (16:50 +0200)]
Read in TPR char buffer as vector
Perform the I/O of the TPR char buffer as xdr_vector operation instead
of using single bytes.
Also use the xdr vector specialization for unsigned char and rvecs.
Refs #2971
Change-Id: I20534985fbdee8108792f676b3cb4264ab74c456
Mark Abraham [Tue, 24 Sep 2019 10:01:29 +0000 (12:01 +0200)]
Fix clang unused-function warnings
There are many places where inline functions declared in headers
won't be used, so we should suppress the clang warning about that.
Change-Id: I01c26f68473543007427e5baafc74eee966c0e68
Berk Hess [Thu, 19 Sep 2019 20:02:43 +0000 (22:02 +0200)]
Remove obsolete dd_get_ns_ranges()
Change-Id: Id64f2d51d6684cf4c22ebd770963cc69f7af087b