alexxy/gromacs.git
6 years agoAdd checking function for whether a buffer is pinned
Mark Abraham [Fri, 17 Nov 2017 18:58:39 +0000 (11:58 -0700)]
Add checking function for whether a buffer is pinned

This is useful for several kinds of tests proposed.

Change-Id: If9fdd29e73f16299190b5485f473f6388aab9ec9

6 years agoImprove handling of PME GPU force buffer
Mark Abraham [Mon, 6 Nov 2017 08:28:01 +0000 (09:28 +0100)]
Improve handling of PME GPU force buffer

Managed it with the HostAllocator, and moved the responsibility
for its lifetime to the PME GPU staging structure. The buffer
does not use CUDA pinning yet.

Change-Id: Ia6fdbdb2509137fec1c6cf2a4ac8c04b1696b58f

6 years agoMake gpu_utils-test build with GMX_CLANG_CUDA
Aleksei Iupinov [Fri, 10 Nov 2017 15:04:35 +0000 (16:04 +0100)]
Make gpu_utils-test build with GMX_CLANG_CUDA

Same workarounds are applied to libgpu_utilstest as for libgromacs.
Renamed libgpu_utilstest target to gpu_utilstest_cuda to avoid the
double "lib" prefix in the filename.

Refs #2259, #2293

Change-Id: I16b07a13ce2dca30079a889e2b314483d82d3674

6 years agoAlso print 1x1 pair-list setup to log
Berk Hess [Tue, 31 Oct 2017 10:03:02 +0000 (11:03 +0100)]
Also print 1x1 pair-list setup to log

mdrun now prints the equivalent 1x1 pair-list setup in addition
to the NxM list setup. This is to clarify that we can use short
pair list buffers because of our cluster setup.
The list setup is now also printed in case we have a single list.
Removed the note on needing to increase nstlist with a GPU when
we automatically change nstlist.
Changed pick_nbnxn_kernel and nbnxn_atomdata_init to use mdlog
to get correct spacing between paragraphs.

Also cleaned up the verletbuf list setup getter functions.

Change-Id: Ic7b5967b0a62aee9fee9837f60a134fd571ff405

6 years agoRename and expose "generic" GPU memory transfer functions
Aleksei Iupinov [Thu, 9 Nov 2017 18:01:43 +0000 (19:01 +0100)]
Rename and expose "generic" GPU memory transfer functions

Dropped the "_generic" suffix from the names. Made the sync/async
argument an enum class instead of boolean.
Made PME use synchronous versions of the functions for unit tests.
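
The switch from a boolean to an enum class can be illustrated with a minimal sketch. The names TransferKindSketch and copyToDeviceSketch are hypothetical (the commit does not list the real identifiers), and a plain memcpy stands in for the GPU copy so that the example runs without a device:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical enum; the commit does not name the real type or its values.
enum class TransferKindSketch
{
    Sync,
    Async
};

// Stand-in for a host-to-device copy. A plain memcpy replaces the GPU call
// so that the sketch runs without a device.
// Returns true when the caller would still have to wait for completion.
inline bool copyToDeviceSketch(void* dst, const void* src, std::size_t bytes, TransferKindSketch kind)
{
    assert(dst != nullptr && src != nullptr);
    std::memcpy(dst, src, bytes);
    return kind == TransferKindSketch::Async;
}
```

A call such as copyToDeviceSketch(dst, src, n, TransferKindSketch::Sync) states its intent at the call site, which a bare true or false argument does not.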

Change-Id: I5fd2490d58370d9f0405aea1a74237fa8107cbab

6 years agoOnly issue FFT warning messages on changes
Erik Lindahl [Sun, 12 Nov 2017 13:14:10 +0000 (06:14 -0700)]
Only issue FFT warning messages on changes

Similar to other CMake modules, we should only issue
warnings at the first invocation, or if the FFT library
was changed.

Change-Id: I6dba59f1021984d9a744a55d797814c1c9d89b20

6 years agoPME-gather: 4xN SIMD
Roland Schulz [Sat, 8 Jul 2017 00:40:48 +0000 (17:40 -0700)]
PME-gather: 4xN SIMD

Speedup on KNL 11% for spread/gather (3% total) on ion-channel

Change-Id: I1a0624408b4e8f7bd441dfe2c260f80d211351d0

6 years agoImport cmake Modules/FindCUDA.cmake
Mark Abraham [Tue, 24 Oct 2017 15:40:58 +0000 (17:40 +0200)]
Import cmake Modules/FindCUDA.cmake

CUDA 9.0 issues large numbers of -Wundef warnings from its internal
headers. FindCUDA.cmake should be including such headers as "system"
headers, so to prepare for a patch where it is modified to do that,
this commit imports that file from v3.4.3 of the CMake repository,
because that is a choice likely to work with all future versions of
CMake.

It needs some supporting cmake files that are included unmodified,
so GROMACS does not assert copyright on those. The main FindCUDA.cmake
file is modified only to be able to find those files

Refs #2276

Change-Id: I69ad39dc805648a6cc5e27bb7fcd229f5f2a538a

6 years agoRename load1DualHsimd to loadU1DualHsimd
Roland Schulz [Thu, 9 Nov 2017 22:43:45 +0000 (14:43 -0800)]
Rename load1DualHsimd to loadU1DualHsimd

Documentation didn't require any alignment, the test didn't use
alignment, and none of the implementations required any alignment.
But the name suggested that alignment is required.
Only current usage had 2-wide alignment but requiring that
would make the function less general without any advantage.

Change-Id: I651c1327a3febc368cb4b039ad226d0771770e60

6 years agoAVX: Improve load1DualHsimd
Roland Schulz [Thu, 9 Nov 2017 23:02:41 +0000 (15:02 -0800)]
AVX: Improve load1DualHsimd

instr+uop: 4->3, throughput/port-pressure(on 5): 3->1
(IACA numbers for IVB-SKL)

Change-Id: Id768cb951dcbace1473448fcd63fa7d40b0e7da6

6 years agoRevert "Use -mavx2 -mfma instead of -march with AVX2"
Roland Schulz [Fri, 10 Nov 2017 02:08:08 +0000 (03:08 +0100)]
Revert "Use -mavx2 -mfma instead of -march with AVX2"

This reverts commit 062a6b81498b61b2bfc4ec7441b844d76aae445b.

Reason for revert: Breaks support for ICC (16-18) which doesn't have -mavx2 or -mfma.

Change-Id: I01cf3e9db332a405fd9419b6382240f5fcecf633

6 years agoRename synchronous GPU transfer functions to match the asynchronous ones
Aleksei Iupinov [Thu, 9 Nov 2017 18:07:30 +0000 (19:07 +0100)]
Rename synchronous GPU transfer functions to match the asynchronous ones

Change-Id: I5cb8e9cab208c1d0c62f985ec3140540ea427fb2

6 years agoPrepared t_mdatoms for using vector
Mark Abraham [Wed, 8 Nov 2017 11:22:57 +0000 (12:22 +0100)]
Prepared t_mdatoms for using vector

Wrapped it in another C++ class because the group-scheme kernels
compile as plain C and this permits the contained t_mdatoms to
be unmodified. The class has responsibility for maintaining the
allocations for any of the fields of t_mdatoms that need to be
managed with a std::vector plus perhaps an allocator.
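
A minimal sketch of this wrapping pattern follows. The struct t_mdatoms_sketch and the class MDAtomsSketch are simplified, hypothetical stand-ins (the real t_mdatoms has many more fields); the point is only that the vector owns the storage while the plain struct keeps a raw pointer that C-style code can read:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Simplified stand-in for the C-compatible struct.
struct t_mdatoms_sketch
{
    int    nr;
    float* chargeA;
};

// Hypothetical wrapper: owns the storage and keeps the plain struct's
// pointer in sync with the vector.
class MDAtomsSketch
{
public:
    MDAtomsSketch() : mdatoms_{0, nullptr} {}

    MDAtomsSketch(const MDAtomsSketch&) = delete;
    MDAtomsSketch& operator=(const MDAtomsSketch&) = delete;

    void resize(std::size_t numAtoms)
    {
        chargeA_.resize(numAtoms);
        mdatoms_.nr      = static_cast<int>(numAtoms);
        mdatoms_.chargeA = chargeA_.data(); // refresh after a possible reallocation
    }

    t_mdatoms_sketch* mdatoms() { return &mdatoms_; }

private:
    std::vector<float> chargeA_;
    t_mdatoms_sketch   mdatoms_;
};

// Stand-in for code that only sees the plain struct, as a C kernel would.
inline float totalChargeSketch(const t_mdatoms_sketch* md)
{
    assert(md != nullptr);
    float sum = 0.0f;
    for (int i = 0; i < md->nr; i++)
    {
        sum += md->chargeA[i];
    }
    return sum;
}
```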

Change-Id: I6fef70beeb8d43f3e048cec02380f8ebf8153ecb

6 years agoIntroduce HostAllocationPolicy
Mark Abraham [Mon, 6 Nov 2017 07:45:49 +0000 (08:45 +0100)]
Introduce HostAllocationPolicy

This permits host-side standard containers and smart pointers to have
their contents placed in memory suitable for efficient GPU transfer.

The behaviour can be configured at run time during simulation setup,
so that if we are not running on a GPU, then none of the buffers that
might be affected actually are. The downside is that all such
containers now have state.
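
The trade-off described here (run-time configuration at the cost of stateful containers) can be sketched with a small allocator that carries a mode flag. All names below are hypothetical, and both modes simply call malloc; real pinning would go through the GPU runtime, which this sketch leaves out:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <vector>

// Hypothetical mode names.
enum class AllocationModeSketch
{
    Plain,
    PinnedForGpu
};

template<typename T>
class HostAllocatorSketch
{
public:
    using value_type = T;

    HostAllocatorSketch() = default;
    explicit HostAllocatorSketch(AllocationModeSketch mode) : mode_(mode) {}

    template<typename U>
    HostAllocatorSketch(const HostAllocatorSketch<U>& other) : mode_(other.mode())
    {
    }

    T* allocate(std::size_t n)
    {
        // Both modes use malloc here; a real PinnedForGpu branch would
        // also have to register the pages with the GPU runtime.
        void* p = std::malloc(n * sizeof(T));
        if (p == nullptr && n != 0)
        {
            throw std::bad_alloc();
        }
        return static_cast<T*>(p);
    }

    void deallocate(T* p, std::size_t) { std::free(p); }

    AllocationModeSketch mode() const { return mode_; }

private:
    // This member is the per-container state that the commit message mentions.
    AllocationModeSketch mode_ = AllocationModeSketch::Plain;
};

template<typename T, typename U>
bool operator==(const HostAllocatorSketch<T>& a, const HostAllocatorSketch<U>& b)
{
    return a.mode() == b.mode();
}

template<typename T, typename U>
bool operator!=(const HostAllocatorSketch<T>& a, const HostAllocatorSketch<U>& b)
{
    return !(a == b);
}
```

A container declared as std::vector<float, HostAllocatorSketch<float>> picks its mode from the allocator passed at construction, so the choice can be made during setup rather than at compile time.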

Change-Id: I9367d0f996de04c21312cef2081cc08148f80561

6 years agoICC should use ZMM if code anyhow uses ZMM
Roland Schulz [Mon, 30 Oct 2017 18:33:33 +0000 (11:33 -0700)]
ICC should use ZMM if code anyhow uses ZMM

Change-Id: Iaea73df12065b3d4ba1974e48b864f44c9b7fe44

6 years agoFix scalar blend
Roland Schulz [Wed, 25 Oct 2017 19:24:42 +0000 (12:24 -0700)]
Fix scalar blend

Change-Id: I580af279cdba494ec13029259e4fd0867a7e5ea2

6 years agoUpdate to TNG v 1.8.1
Magnus Lundborg [Mon, 23 Oct 2017 11:20:00 +0000 (13:20 +0200)]
Update to TNG v 1.8.1

Fixes #2187 and #2250.

Change-Id: Icf81d5f3ce916e984750e1511d32e16ebc45b6f9

6 years agoUse -mavx2 -mfma instead of -march with AVX2
Szilárd Páll [Wed, 18 Oct 2017 15:01:54 +0000 (17:01 +0200)]
Use -mavx2 -mfma instead of -march with AVX2

This was (likely) only a workaround for some early gcc version that did
not support correct AVX2 code-generation with just the -mavx2 -mfma
flags. However, with AVX2, just as with other SIMD flavors, we should
not request arch-specific tuning just to get the desired SIMD flavor
enabled.

Change-Id: Ib0c6388bebcffbf0719b438451d3943f51fba4a4

6 years agoReform gmx_pme_pp alloc and use vector
Mark Abraham [Tue, 7 Nov 2017 02:26:32 +0000 (03:26 +0100)]
Reform gmx_pme_pp alloc and use vector

Introduced a helper struct for describing the partner PP ranks.

Reduced some of the conditional compilation.

Updated some naming from node to rank.

Fixed over-use of charge_pp.

Change-Id: I00b59dd116740721ed707af4242c0d44f1615d56

6 years agoIntroduce gmxopencl.h
Mark Abraham [Mon, 6 Nov 2017 07:43:24 +0000 (08:43 +0100)]
Introduce gmxopencl.h

This header wraps the different ways to include the main OpenCL header
on different platforms, including suppressions for the warnings about
usage of deprecated API elements. NVIDIA only officially supports the
version with the deprecated elements, so we need to continue to use it.

Change-Id: Ie24f20d43272e1747bcbd693815e96cc200d5f50

6 years agoMerge common nbnxn CUDA/OpenCL GPU wait code-paths
Szilárd Páll [Fri, 20 Oct 2017 20:26:25 +0000 (22:26 +0200)]
Merge common nbnxn CUDA/OpenCL GPU wait code-paths

The entire GPU wait including timing accumulation as well as staging
data reduction of the nonbonded GPU modules has been unified by
including a single templated version of the code into the common header.
Code has only been moved and changed in minor ways when necessary (e.g.
for the rvec reduction).

Change-Id: Ic9c9690be58a78f92ca99d2af30068e19c19cc6c

6 years agoTest clang on ARM in nightly matrix
Mark Abraham [Tue, 10 Oct 2017 10:09:10 +0000 (10:09 +0000)]
Test clang on ARM in nightly matrix

Also suppress lots of compiler warnings from useless use of
__vectorcall on this target for this compiler.

ARM are targeting clang for future development, so hopefully this
either isn't needed or will work in future. Either way, this change
will continue to do the right thing.

Change-Id: I211952a24aefee8434cc6b32322f359b2a22687b

6 years agoAdd wallcycle timer for the PME GPU F reduction
Szilárd Páll [Mon, 23 Oct 2017 14:11:46 +0000 (16:11 +0200)]
Add wallcycle timer for the PME GPU F reduction

Change-Id: I85185f2acdf3ebdcbac109ef723eb458bc0e9008

6 years agoSplit off nbnxn GPU timing and staging reduction
Szilárd Páll [Fri, 20 Oct 2017 17:52:13 +0000 (19:52 +0200)]
Split off nbnxn GPU timing and staging reduction

Code reorganization that moves the timing related functions as well as
energy and shift force reduction into separate functions in both CUDA
and OpenCL versions of nbnxn_gpu_wait_for_gpu().

Change-Id: Ic5c9694d9de7f80a772e97f5c9e05bab77a3b82a

6 years agoImprove PME includes
Mark Abraham [Tue, 7 Nov 2017 01:52:18 +0000 (02:52 +0100)]
Improve PME includes

Changing an internal ewald-module header for GPU support should not
lead to files outside that module needing to be recompiled. Moved enum
declarations for use outside the module to the header file that
declares such things. Restored necessary includes that were being
satisfied transitively from the internal header, that were prematurely
removed in fae8902688dc48be56e.

Change-Id: I18c3146e80aba9ad0a2c485f2355bc214cbb083c

6 years agoDeduplicate CUDA and OpenCL timer struct
Szilárd Páll [Fri, 20 Oct 2017 18:55:45 +0000 (20:55 +0200)]
Deduplicate CUDA and OpenCL timer struct

The struct is identical in both CUDA/OpenCL so it's better placed in a
common header, but this needs to be an internal-only header as it pulls
in CUDA dependencies.

Change-Id: I907d68b7c298f2ba0e7a1af2baf4819f637e2f2e

6 years agoFixed check for water in gen_vsite.cpp
David van der Spoel [Thu, 12 Oct 2017 07:06:44 +0000 (09:06 +0200)]
Fixed check for water in gen_vsite.cpp

Pdb2gmx would break when generating virtual sites if water oxygens
were not named OW. Now checking for the atomnumber instead.
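
The difference between the two checks can be sketched as follows. AtomSketch and both helper functions are hypothetical and much simpler than the pdb2gmx data structures; the sketch assumes the caller already knows that the residue is water:

```cpp
#include <cassert>
#include <string>

// Hypothetical minimal atom record.
struct AtomSketch
{
    std::string name;
    int         atomicNumber;
};

// Name-based test: misses water oxygens that are named, for example, "O" or "OH2".
inline bool isWaterOxygenByNameSketch(const AtomSketch& atom)
{
    return atom.name == "OW";
}

// Element-based test: any oxygen (Z = 8) qualifies, whatever it is called.
inline bool isOxygenByAtomicNumberSketch(const AtomSketch& atom)
{
    assert(atom.atomicNumber > 0 && "atomic number must be set");
    return atom.atomicNumber == 8;
}
```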

Fixes #2268

Change-Id: I326f683e4940ad02351dcbe0c00e266a82b203f6

6 years agoMerge "Merge branch release-2016"
Mark Abraham [Fri, 3 Nov 2017 01:05:42 +0000 (02:05 +0100)]
Merge "Merge branch release-2016"

6 years agoFix Ekin at step 0 with COM removal
Berk Hess [Wed, 1 Nov 2017 16:21:48 +0000 (17:21 +0100)]
Fix Ekin at step 0 with COM removal

The kinetic energy at step 0 was computed from the velocities without
the center of mass velocity removed. This could cause a relatively
large jump in kinetic energy, especially for small systems.
Now compute_globals is called twice with COM removal so we get
the correct kinetic energy.
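
The size of the jump can be seen in a small worked example. The types and functions below are hypothetical and unrelated to compute_globals; they only compare the kinetic energy before and after removing the center-of-mass (COM) velocity:

```cpp
#include <cassert>
#include <vector>

struct ParticleSketch
{
    double mass;
    double v[3];
};

// Sum of 0.5 * m * |v|^2 over all particles.
inline double kineticEnergySketch(const std::vector<ParticleSketch>& particles)
{
    double ekin = 0.0;
    for (const ParticleSketch& p : particles)
    {
        ekin += 0.5 * p.mass * (p.v[0] * p.v[0] + p.v[1] * p.v[1] + p.v[2] * p.v[2]);
    }
    return ekin;
}

// Subtracts the mass-weighted mean velocity from every particle.
inline void removeComVelocitySketch(std::vector<ParticleSketch>& particles)
{
    double totalMass   = 0.0;
    double momentum[3] = {0.0, 0.0, 0.0};
    for (const ParticleSketch& p : particles)
    {
        totalMass += p.mass;
        for (int d = 0; d < 3; d++)
        {
            momentum[d] += p.mass * p.v[d];
        }
    }
    assert(totalMass > 0.0);
    for (ParticleSketch& p : particles)
    {
        for (int d = 0; d < 3; d++)
        {
            p.v[d] -= momentum[d] / totalMass;
        }
    }
}
```

For two unit-mass particles with x-velocities 1 and 3, the kinetic energy is 5 before COM removal and 1 afterwards, which shows how large the discrepancy can be for a small system.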

Appropriate mdrun tests for energy-conserving integrators are also added.

Change-Id: I87ab08d21a35621735ab3c65fc50af9992120be3

6 years agoNew mdp input for electric fields.
David van der Spoel [Tue, 31 Oct 2017 12:25:56 +0000 (13:25 +0100)]
New mdp input for electric fields.

New format for MDP input for electric fields that is consistent
with the manual and that is comprehensible.

Change-Id: I5f9f434080f5217d2473c16377aee962692b9ee9

6 years agoReplace math.h by cmath includes in cpp files
Aleksei Iupinov [Tue, 31 Oct 2017 22:51:19 +0000 (23:51 +0100)]
Replace math.h by cmath includes in cpp files

Partially fixes #2285 (for non-GPU build)

Change-Id: I638a0b8ba5e4e04e00730b01640ac7c6a41834ed

6 years agoMerge branch release-2016
Mark Abraham [Thu, 2 Nov 2017 09:43:11 +0000 (10:43 +0100)]
Merge branch release-2016

Ensured fix for gmx compare cmp_atoms went to the right code.

Change-Id: Iabc8ec03e7ebc45517f63697c3e7dea12b3f5398

6 years agoAdd missing Ewald correction for pme-user
Berk Hess [Thu, 2 Nov 2017 08:42:39 +0000 (09:42 +0100)]
Add missing Ewald correction for pme-user

With coulomb-type = pme-user, the Ewald mesh energy was not subtracted
leading to (very) incorrect Coulomb energies and forces.

Fixes #2286

Change-Id: Idfef9896d484e254264150e718c5516a832a2ad4

6 years agoSmall change to LaTeX manual generation
Paul Bauer [Mon, 30 Oct 2017 14:40:16 +0000 (15:40 +0100)]
Small change to LaTeX manual generation

Removed the gmxlite if statements in the pdf manual source files. They
made it more difficult to generate the new markup style files and are
apparently not needed.

Change-Id: Ica401f103c8f9682c7a45bdd90aa8680db7ff56a

6 years agoFix thread-MPI rank choice for orientation restraints
Mark Abraham [Mon, 30 Oct 2017 17:13:07 +0000 (18:13 +0100)]
Fix thread-MPI rank choice for orientation restraints

Only a single rank is supported, so that must be what the thread-MPI
code will choose. There's another check later on that catches the
multi-rank MPI case.

Change-Id: I9ccf5fbe958fc0c004a89ebc92a352460e9cba1f

6 years agoRemove unused PME GPU declarations
Aleksei Iupinov [Wed, 1 Nov 2017 11:35:53 +0000 (12:35 +0100)]
Remove unused PME GPU declarations

Change-Id: If64bcf73e825f6cd5ba48345f931c9dd25241046

6 years agoMove pme_gpu_finish_computation() documentation to the declaration
Aleksei Iupinov [Wed, 1 Nov 2017 11:31:52 +0000 (12:31 +0100)]
Move pme_gpu_finish_computation() documentation to the declaration

Change-Id: I4970424eb5108e51c6e8b00b55a60854900e16b9

6 years agoFixing missing references in web documentation
Paul Bauer [Wed, 1 Nov 2017 11:44:51 +0000 (12:44 +0100)]
Fixing missing references in web documentation

Change-Id: Ifca209c15f4cec3fed24e2070df8fa85320d02dd

6 years agoFix erroneous PME GPU "step" namings
Aleksei Iupinov [Tue, 31 Oct 2017 16:15:51 +0000 (17:15 +0100)]
Fix erroneous PME GPU "step" namings

Previous PME GPU code/documentation assumed single PME computation
per MD step, while there can actually be several. This change
replaces erroneous "step" names in the PME GPU module with
"(PME) computation" and similar.

Change-Id: Id230e848e0db0648a429bfc35a59106d1db1f7c9

6 years agoImprove handling of GPU IDs
Mark Abraham [Wed, 25 Oct 2017 10:08:01 +0000 (12:08 +0200)]
Improve handling of GPU IDs

Shifted responsibility for handling parsing of mdrun -gpu_id to early
in the runner, rather than as part of the assignment process.

Moved utility string handling + tests to taskassignment module, since
they only supported this process. Updated string handling in gmx
tune_pme to use more std::string and use the new
functionality. makeGpuIds will be used to replace the code in
assign_rank_gpu_ids in a subsequent patch.
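
Parsing the ID string once, early, might look like the sketch below. The function parseDigitGpuIdsSketch is hypothetical and is not the makeGpuIds mentioned above; it assumes the simplest input format, a string in which every character is a single-digit GPU ID:

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Turns a string such as "0011" into {0, 0, 1, 1}.
// Throws on any character that is not a decimal digit.
inline std::vector<int> parseDigitGpuIdsSketch(const std::string& idString)
{
    std::vector<int> ids;
    ids.reserve(idString.size());
    for (char c : idString)
    {
        if (c < '0' || c > '9')
        {
            throw std::invalid_argument("GPU ID string may only contain digits");
        }
        ids.push_back(c - '0');
    }
    return ids;
}
```

Validating up front means that a malformed string is rejected before any task assignment starts.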

Change-Id: I8d39cc69d0f96ac395858ed7cbe9f2947081b384

6 years agoSimplify PME GPU synchronization code
Aleksei Iupinov [Fri, 27 Oct 2017 13:37:47 +0000 (15:37 +0200)]
Simplify PME GPU synchronization code

Most synchronization events are removed; synchronization is mostly
done by a single stream synchronization call at the end of the step.

Change-Id: Ia793f2623d81ae8e3f6dfb5c84a6a636e422d982

6 years agoReuse epbcXY logic
Aleksei Iupinov [Tue, 31 Oct 2017 15:42:25 +0000 (16:42 +0100)]
Reuse epbcXY logic

Change-Id: I9ec7521b050521932b64b2b08a58c7b530975fb0

6 years agoFix nstlist increase warning print
Szilárd Páll [Tue, 31 Oct 2017 14:11:38 +0000 (15:11 +0100)]
Fix nstlist increase warning print

The log file warning print had a buggy conditional which this commit
fixes.

Change-Id: Ic106fa3fba54b2c394818e3a642f462d2675a2b1

6 years agoCheck CUDA available/compiled code compatibility
Szilárd Páll [Mon, 16 Oct 2017 15:40:23 +0000 (17:40 +0200)]
Check CUDA available/compiled code compatibility

Added an early check to detect when the gmx binary does not embed code
compatible with the GPU device it tries to use nor does it have PTX that
could have been JIT-ed.

Additionally, if the user manually sets GMX_CUDA_TARGET_COMPUTE=20 and
no later SM or COMPUTE but runs on >2.0 hardware, we'd be executing
JIT-ed Fermi kernels with incorrect host-side code assumptions
(e.g amount of shared memory allocated or texture type).
This change also prevents such cases.

Fixes #2273

Change-Id: I5472b1a33e584a75f451e21e9fd25992633fbea9

6 years agoUpdate treatment of GPU compatibility data structure
Mark Abraham [Wed, 25 Oct 2017 09:45:32 +0000 (11:45 +0200)]
Update treatment of GPU compatibility data structure

Now we only construct the vector of compatible GPUs once per mdrun,
and are less coupled to hw_info and gpu_info structs.

Change-Id: I181f0486d0ea1670de7a85046c94c1fef83dce17

6 years agoFix nstlist increase warning print
Szilárd Páll [Tue, 31 Oct 2017 14:18:27 +0000 (15:18 +0100)]
Fix nstlist increase warning print

The log file warning print had a buggy conditional which this commit
fixes.

NOTE: skip when merging, upstream fix submitted separately.

Change-Id: Id85223a3f762bbab26525a60987870d77cd5a01c

6 years agoFixed mdp output from electric field code.
David van der Spoel [Mon, 30 Oct 2017 08:03:13 +0000 (09:03 +0100)]
Fixed mdp output from electric field code.

Added two new tests for MDP output.

Fixes #2258

Change-Id: I495454bd2349be836c1a3ef5985288a996abf20e

6 years agoFix reference mode build unused function warnings
Aleksei Iupinov [Mon, 30 Oct 2017 10:56:17 +0000 (11:56 +0100)]
Fix reference mode build unused function warnings

Change-Id: Ibd1ad83c5dbeffe86e47156d456d78ab1ab8aeeb

6 years agoRemove unused sign parameter from dih_angle()
Berk Hess [Sun, 29 Oct 2017 21:20:54 +0000 (22:20 +0100)]
Remove unused sign parameter from dih_angle()

Change-Id: I88a73ca49b6acfc59b4baf0d847aa81542a870ca

6 years agoArrayRef: Replace fromVector with subArray
Roland Schulz [Fri, 13 Oct 2017 18:49:46 +0000 (11:49 -0700)]
ArrayRef: Replace fromVector with subArray

Creating ArrayRef from iterators is potentially dangerous,
because it is incorrect for non-contiguous containers.

arrayRefFromVector(v.begin()+start, v.begin()+start+length)
is replaced with
ArrayRef<T>(v).subArray(start, length)

Also:
- Combine all conversion constructors
  Removes code duplication and makes conversion more powerful
  (e.g. base pointer or containers with allocators).
- remove fromPointers and arrayRefFromPointers
  Wasn't used by any code
- remove fromArray and replace with arrayRefFromArray
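
The subArray approach can be sketched with a stripped-down view class. ArrayRefSketch is a hypothetical stand-in, not the GROMACS ArrayRef; it shows why building the view from the container and then slicing it avoids the iterator pitfall:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

template<typename T>
class ArrayRefSketch
{
public:
    ArrayRefSketch(T* begin, T* end) : begin_(begin), end_(end) {}

    // Built from the container itself, so std::vector guarantees contiguity.
    explicit ArrayRefSketch(std::vector<T>& v) : begin_(v.data()), end_(v.data() + v.size()) {}

    std::size_t size() const { return static_cast<std::size_t>(end_ - begin_); }

    T& operator[](std::size_t i) const
    {
        assert(i < size());
        return begin_[i];
    }

    // Returns a view of count elements starting at index start.
    ArrayRefSketch subArray(std::size_t start, std::size_t count) const
    {
        assert(start + count <= size());
        return ArrayRefSketch(begin_ + start, begin_ + start + count);
    }

private:
    T* begin_;
    T* end_;
};
```

The slice shares storage with the vector, so writing through it changes the original elements.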

Change-Id: I05ad6b285ece58739d9f5bce48f9ecf4ade3454e

6 years agoAdded option -water tips3p to pdb2gmx.
David van der Spoel [Fri, 13 Oct 2017 16:36:27 +0000 (18:36 +0200)]
Added option -water tips3p to pdb2gmx.

Fixes #2272

Change-Id: Ibfc63009767fd667df51ff10041791268351e1ca

6 years agoBring PME GPU/CUDA internal structure names to CamelCase
Aleksei Iupinov [Fri, 27 Oct 2017 11:01:19 +0000 (13:01 +0200)]
Bring PME GPU/CUDA internal structure names to CamelCase

This only does mechanical renaming (e.g. pme_gpu_settings_t to
PmeGpuSettings). Any meaningful renames will be done separately.

Change-Id: I7ea2af94fd0212ff6edcf433ff21842c5bbb67b0

6 years agoFix and update hw_info
Mark Abraham [Tue, 24 Oct 2017 19:59:45 +0000 (21:59 +0200)]
Fix and update hw_info

Stopped using typedef struct (so later we can put a vector into the
struct).

Managed the memory using a unique_ptr, and made the interface reflect
that it is a file static, rather than something that is owned by
e.g. the runner.

Amended docs to clarify the sense of "global."

Change-Id: I1ce9bc42e03668498051b59aaeeb9e50a9f6f762

6 years agoUse new/delete for gmx_pme_t
Aleksei Iupinov [Fri, 27 Oct 2017 13:09:04 +0000 (15:09 +0200)]
Use new/delete for gmx_pme_t

Change-Id: I176b1d26d484514c65cae412c474b65410191d38

6 years agoSimplify PME data handling in runner
Aleksei Iupinov [Thu, 26 Oct 2017 15:04:34 +0000 (17:04 +0200)]
Simplify PME data handling in runner

Differing ownership of the PME data for PME-only and other ranks
is now hidden behind a reference. gmx_pme_init() now returns
a pointer to the allocated structure.

Change-Id: Ia9c5117a0db43a6564298dd621cf9254f0423acf

6 years agoMake PME tuning logic more readable
Aleksei Iupinov [Thu, 26 Oct 2017 14:48:06 +0000 (16:48 +0200)]
Make PME tuning logic more readable

Change-Id: Ie53693a84264ed33c17894aa551cf476a3ced26b

6 years agoRemove incorrect comment for CHARMM tips3p
Berk Hess [Sun, 29 Oct 2017 21:12:20 +0000 (22:12 +0100)]
Remove incorrect comment for CHARMM tips3p

Change-Id: I383e28a7b75aa3654a65d15358820a28f9163308

6 years agoRemove unused PME grid dump debug functions
Aleksei Iupinov [Thu, 26 Oct 2017 11:25:23 +0000 (13:25 +0200)]
Remove unused PME grid dump debug functions

Change-Id: Iac748080fdf29e6f35ecf37de2b968e70c72605e

6 years agoFix hw detection more
Mark Abraham [Thu, 26 Oct 2017 08:36:13 +0000 (10:36 +0200)]
Fix hw detection more

gmx_hardware_detect was called in response to GoogleTest environment
SetUp function, so the cleanup for its global should occur in response
to the corresponding TearDown function. Both those should be virtual.
Thus the hardwareInfo should not be in a smart pointer called by a
destructor that might be called at a different point from TearDown.

The new getter function and the callback that handles making the first
call to it conform better to GoogleTest's recommendation to arrange to
call AddGlobalTestEnvironment from main() rather than rely on static
initialization.

Made hardwareInit a non-member function because that improves
encapsulation.

Change-Id: I2f8e14ecc1707bf31d023a4eb4fea0a20543910b

6 years agoReplace a few asserts with GMX_ASSERT's
Aleksei Iupinov [Thu, 26 Oct 2017 08:36:48 +0000 (10:36 +0200)]
Replace a few asserts with GMX_ASSERT's

Change-Id: I18e614de57fc06f3faabc687140821223bd7c4f4

6 years agoRemove defunct PME initialization error code return
Aleksei Iupinov [Thu, 26 Oct 2017 11:49:22 +0000 (13:49 +0200)]
Remove defunct PME initialization error code return

The error was never actually returned, and invalid inputs
are already treated with exceptions anyway.

Change-Id: I6063612c3a2e760fb56b7bdf5b1624ab2fc031bd

6 years agoMake release matrix work again
Mark Abraham [Mon, 9 Oct 2017 11:50:25 +0000 (13:50 +0200)]
Make release matrix work again

Seems we didn't test this matrix when we updated infrastructure some
time ago.

Change-Id: Ib19672db6144bb40f08d2fcace4d43dbd52e6823

6 years agoReorganize PME GPU launch
Szilárd Páll [Mon, 16 Oct 2017 18:15:25 +0000 (20:15 +0200)]
Reorganize PME GPU launch

Wrapped the first (prep/spread) and second stage (fft/gather) of PME GPU
in functions. Moved the second stage of the regular PME GPU mode to after
the nonbonded x transform to ensure that the transform can overlap with
spread even when the launch overhead of the FFT kernels is high.

Also removed TPI-related PME-GPU launch conditions as this should be
checked much earlier. Noted in the force flags docs that the current
code assumes GMX_FORCE_STATECHANGED is used only with TPI.

Change-Id: I7f765d66c6c4e7e54812b81b2dd23751af0b06b5

6 years agoNew quote
Mark Abraham [Wed, 18 Oct 2017 21:02:44 +0000 (23:02 +0200)]
New quote

Change-Id: Id1625b1c836c64a5bd1e24fbf5b3ef2b104f102d

6 years agoTeach the copyright checker about template.cpp
Mark Abraham [Tue, 24 Oct 2017 16:03:24 +0000 (18:03 +0200)]
Teach the copyright checker about template.cpp

Now it won't warn about having to ignore it when calling
uncrustify via the scripts.

Change-Id: I2d5675a1a16dc01f6f9a45440f6807319c766944

6 years agoFix gmx check for tprs with different #atoms
Berk Hess [Tue, 24 Oct 2017 19:01:09 +0000 (21:01 +0200)]
Fix gmx check for tprs with different #atoms

Fixes #2279.

Change-Id: I0a56cb30922ba2831bd6177ca6025e15a25dbed6

6 years agoRemove duplicate/outdated function declaration
Aleksei Iupinov [Tue, 24 Oct 2017 14:24:11 +0000 (16:24 +0200)]
Remove duplicate/outdated function declaration

Change-Id: Ie54afcc501e0658c8a29a80831ff87765e5e7786

6 years agoMove commrec duty checking into simple getters
Aleksei Iupinov [Tue, 17 Oct 2017 11:15:24 +0000 (13:15 +0200)]
Move commrec duty checking into simple getters

This isolates all reads of cr->duty, and asserts on cr->duty being valid,
allowing its assignment to be refactored in later changes.
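
A getter of this kind might look like the following sketch. The flag constants, CommrecSketch and rankHasDutySketch are hypothetical simplifications; the idea is that every read of the duty field goes through one function that also asserts validity:

```cpp
#include <cassert>

// Hypothetical flag values.
constexpr int c_dutyPpSketch  = 1 << 0;
constexpr int c_dutyPmeSketch = 1 << 1;

// Hypothetical stand-in for the communication record.
struct CommrecSketch
{
    int duty;
};

// A duty is valid when at least one known flag is set.
inline bool hasValidDutySketch(const CommrecSketch& cr)
{
    return (cr.duty & (c_dutyPpSketch | c_dutyPmeSketch)) != 0;
}

// Single point of access for reading the duty field.
inline bool rankHasDutySketch(const CommrecSketch& cr, int duty)
{
    assert(hasValidDutySketch(cr));
    return (cr.duty & duty) != 0;
}
```

With all reads funnelled through one function, the way the field is assigned can change later without touching the callers.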

Change-Id: I9b48be06b8d2db18105619ea1acfe38aa541b622

6 years agoCorrect the page allocator description
Aleksei Iupinov [Mon, 23 Oct 2017 09:14:19 +0000 (11:14 +0200)]
Correct the page allocator description

Change-Id: Iea0190978b483ed01fc8279f34ef2a304a11b612

6 years agoEliminate some OCL/CUDA code duplication
Szilárd Páll [Fri, 20 Oct 2017 14:37:49 +0000 (16:37 +0200)]
Eliminate some OCL/CUDA code duplication

Atom to interaction locality conversion and atom range calculation has
been duplicated across the OpenCL and CUDA modules. As an intermediate
step this functionality is now gathered in the common header.

Change-Id: I55b1b34992621ecebed6dad0978a47553511fc87

6 years agoFix incorrect dV/dlambda for walls
Berk Hess [Tue, 10 Oct 2017 07:23:30 +0000 (09:23 +0200)]
Fix incorrect dV/dlambda for walls

The free-energy derivative dV/dlambda for walls, which can
be perturbed by changing atom types of non-wall atoms, only
contained the B-state contribution.

Fixes #2267

Change-Id: I7c6d1b57d1e0e173e1461d55855df45c489e082a

6 years agoSplit off the NMR related analyses from gmx energy.
David van der Spoel [Tue, 23 May 2017 11:20:30 +0000 (13:20 +0200)]
Split off the NMR related analyses from gmx energy.

A new tool gmx nmr is created by straight copying code from
gmx energy to a new tool. The reason is to reduce complexity.

A few cleanups are introduced to pass the valgrind memory
test.

Added references to gmx nmr in the manual.

Change-Id: I8e4d1dec8806a0518c571d7a01c4f70de5bbbd35

6 years agoChange PME GPU gather reduction argument from boolean to enum class
Aleksei Iupinov [Fri, 20 Oct 2017 12:53:01 +0000 (14:53 +0200)]
Change PME GPU gather reduction argument from boolean to enum class

Change-Id: Idacbdfb79313ebf16cf1a7dc19435436d6366d27

6 years agoMerge "Merge branch release-2016"
Berk Hess [Sat, 21 Oct 2017 19:33:39 +0000 (21:33 +0200)]
Merge "Merge branch release-2016"

6 years agoAdd calls to the PME GPU stages
Aleksei Iupinov [Tue, 30 May 2017 14:00:35 +0000 (16:00 +0200)]
Add calls to the PME GPU stages

This adds the inactive calls to PME GPU stages both for PP+PME
and PME-only ranks.

Ref #2054

Change-Id: I5af2ab95cedff422c39592255f01205d42fc7eb7

6 years agoClarify docs for Fmax in EM
Mark Abraham [Fri, 20 Oct 2017 13:06:58 +0000 (15:06 +0200)]
Clarify docs for Fmax in EM

Change-Id: I388c653b3e277289fa1b1ad0ae9f2679679b9cb8

6 years agoMerge branch release-2016
Mark Abraham [Fri, 20 Oct 2017 10:05:05 +0000 (12:05 +0200)]
Merge branch release-2016

Dropped the post-submit change because where we test
with older clang we now always specify a suitable
status for openmp.

Change-Id: I993da20856861c0b8a0888f7fa0deed8853349a8

6 years agoFix warning for confout with periodic molecules
Berk Hess [Wed, 18 Oct 2017 18:18:31 +0000 (20:18 +0200)]
Fix warning for confout with periodic molecules

With periodic molecules, mdrun would, incorrectly, attempt to make
molecules whole for writing the final state to confout.

Fixes #2275

Change-Id: Ib19ca5c2ae6fcca6126773bcdd8a05c8e141c3ce

6 years agoDisable OpenMP with clang 3.4 post-submit config
Szilárd Páll [Thu, 19 Oct 2017 16:53:04 +0000 (18:53 +0200)]
Disable OpenMP with clang 3.4 post-submit config

This avoids a CMake warning which we now parse and catch in jenkins.

Fixes #2277

Change-Id: Id129f9907af32bdecfe07c4ca37d4cb7376d79e2

6 years agoRemove unused OpenCL debugging helpers
Szilárd Páll [Tue, 17 Oct 2017 18:29:58 +0000 (20:29 +0200)]
Remove unused OpenCL debugging helpers

This also helps avoid -Wmissing-declarations in OpenCL utils module.

Change-Id: I16584fca485790e98fd3865ac65c06ac78d58194

6 years agoMade CUDA PME texture reference conditional
Berk Hess [Mon, 16 Oct 2017 07:24:05 +0000 (09:24 +0200)]
Made CUDA PME texture reference conditional

Without texture support we should not reference textures.
Also added const struct to textures.

Change-Id: I1ca4e534da7b9130d12fd6831c119d2139eb16eb

6 years agoFixed missing entries in nrnb arrays.
David van der Spoel [Tue, 17 Oct 2017 20:26:21 +0000 (22:26 +0200)]
Fixed missing entries in nrnb arrays.

Some nrnb index entries were missing in the interaction_function
array and others were zero, so the wrong megaflops
accounting was printed.

Fixes #2274

Change-Id: Ic0b05d30eb5fdfeb7f3e822b42ec7ca4cda58bc5

6 years agoMerge branch release-2016
Mark Abraham [Mon, 16 Oct 2017 06:10:34 +0000 (08:10 +0200)]
Merge branch release-2016

Change-Id: Ia56e987f52e4dee425b12b02940ad9ca18d0c13a

6 years agoImprove vsite parallel checking
Berk Hess [Sun, 24 Sep 2017 20:27:02 +0000 (22:27 +0200)]
Improve vsite parallel checking

The vsite struct now stores internally whether it has been configured
with domain decomposition. This allows for internal checks on valid
commrec, which have now been added.
The vsite constructor now initializes the atom range to invalid values,
so we can check that the thread splitting has been called before
constructing. This would have caught bug #2257.
Removed the vsite struct from the global construct function argument
list, which simplifies the vsite code in several places and
fixes #2257.

Also some general clean-up: removed some snews, added some camelCasing
and doxygen documentation.

More renaming would be beneficial, but should be a separate commit.

Change-Id: I467ec8b8ebfa0da090d4ac0a1d096ad9fab87eb5

6 years agoRelax PME spline computation tolerance in double precision tests
Aleksei Iupinov [Fri, 13 Oct 2017 09:50:19 +0000 (11:50 +0200)]
Relax PME spline computation tolerance in double precision tests

Change-Id: I8c3502dd84e21d20be057d47d4afa589d779eb90

6 years agoUpdate tests for C++11 compiler and standard library
Mark Abraham [Thu, 9 Feb 2017 10:49:38 +0000 (11:49 +0100)]
Update tests for C++11 compiler and standard library

We've started using some more features, so broaden the
range of things for which we check at cmake time.

Also made an explicit error message for older icc that can't handle
newer gcc standard libraries, since this might come up a few times.

Fixes #2116

Change-Id: I3656edb3f7e6f81bbf6ed3ed764bcac56802f87f

6 years agoReplace all ConstArrayRef with ArrayRef<const T>
Roland Schulz [Wed, 4 Oct 2017 06:48:49 +0000 (23:48 -0700)]
Replace all ConstArrayRef with ArrayRef<const T>

1) Remove the alias itself in arrayref.h.
2) All replacements done automatically using sed:
s#ConstArrayRef<const char \*>#ArrayRef<const char *const>#
s#ConstArrayRef<\(.*\)>#ArrayRef<const \1>#

This worked because "const char*" was the only pointer type used as
template argument.
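
The need for a separate rule for "const char *" can be checked at compile time. The sketch below assumes the removed alias was a plain template alias of the form shown; ArrayRefSketch is an empty, hypothetical placeholder for the real class:

```cpp
#include <type_traits>

// Empty placeholder for the real class template.
template<typename T>
class ArrayRefSketch
{
};

// Assumed shape of the removed alias.
template<typename T>
using ConstArrayRefSketch = ArrayRefSketch<const T>;

// Ordinary element type: the general sed rule gives the same type as the alias.
static_assert(std::is_same<ConstArrayRefSketch<int>, ArrayRefSketch<const int>>::value,
              "alias adds const to the element type");

// Pointer element type: const applies to the pointer itself, so the
// equivalent spelling is "const char* const", not "const const char*".
static_assert(std::is_same<ConstArrayRefSketch<const char*>, ArrayRefSketch<const char* const>>::value,
              "const binds to the pointer for pointer element types");
```

A purely textual replacement would have placed the new const in front of the pointee type, which is why the pointer case needed its own sed expression.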

Change-Id: I5eba895a5dc235b95d77670b4f258e423f64f3b8

6 years agoSpecialize ArrayRef for SimdReal
Roland Schulz [Fri, 22 Sep 2017 20:43:50 +0000 (13:43 -0700)]
Specialize ArrayRef for SimdReal

ArrayRef<SimdReal> maps to a range of aligned memory and returns a
Simd type from operator[] (more precisely a reference to a Simd type).
This allows iterating over memory without having to explicitly call
load/store while also avoiding undefined behavior (strict aliasing rule)
caused by casting between reals and SimdReals.

Change-Id: I3d00df088669dacc810052cbcaebe15e62e1d530

6 years agoDo not include headers related to ObservablesHistory
Magnus Lundborg [Tue, 10 Oct 2017 12:13:45 +0000 (14:13 +0200)]
Do not include headers related to ObservablesHistory

Define a destructor for ObservablesHistory to avoid having to include
many extra headers.
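
The technique is commonly used with std::unique_ptr members of forward-declared types. The sketch below uses hypothetical names and puts the "header" and "source" parts in one file; it assumes the members are held through smart pointers, which the commit message does not state:

```cpp
#include <cassert>
#include <memory>

// "Header" part: a forward declaration is enough here.
struct EnergyHistorySketch;

struct ObservablesHistorySketch
{
    ObservablesHistorySketch();
    // Declared here, defined only where EnergyHistorySketch is complete.
    ~ObservablesHistorySketch();

    std::unique_ptr<EnergyHistorySketch> energyHistory;
};

// "Source file" part: the full type is visible from here on.
struct EnergyHistorySketch
{
    int numSteps = 0;
};

ObservablesHistorySketch::ObservablesHistorySketch()  = default;
ObservablesHistorySketch::~ObservablesHistorySketch() = default;
```

Code that includes only the header part can hold an ObservablesHistorySketch without seeing the definition of EnergyHistorySketch, because the unique_ptr deleter is instantiated only in the source file.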

Change-Id: I2681b519ace728dc494f967d17db5478af09f5df

6 years agoFix cpuinfo on clang + non-x86
Mark Abraham [Tue, 10 Oct 2017 09:22:02 +0000 (09:22 +0000)]
Fix cpuinfo on clang + non-x86

Compilers that pretend to be GCC often define such symbols, and the
support for inline assembly does not compile e.g. on ARM. This broke
CPU detection at cmake time, and subsequent compilation. Probably
introduced by commit 863768a4dad. The latest ARM compiler is based on
clang, so we should fix this.

Also de-duplicated some uses of compiler target defines.

Change-Id: Ia21363b9c0fe112762750d93b9feea267a34319f

6 years agoRemove the size_t from the PME gather CUDA kernels
Szilárd Páll [Thu, 12 Oct 2017 15:52:15 +0000 (17:52 +0200)]
Remove the size_t from the PME gather CUDA kernels

Change-Id: If53b9eabc1ac081b33933cc773b5ea932c9e8392

6 years agoRemove useless extern CUDA texture reference declarations from PME
Aleksei Iupinov [Thu, 12 Oct 2017 11:00:59 +0000 (13:00 +0200)]
Remove useless extern CUDA texture reference declarations from PME

These are only accessed from the same compilation unit (pme-spread.cu)
on the device side, and the host side is only using nearby getters.

Change-Id: Ie846193c71142ff5e519e990ef1155b534546a9b

6 years agoRevert "Drop NB_ from GMX_CUDA_NB_SINGLE_COMPILATION_UNIT cmake define"
Aleksei Iupinov [Thu, 12 Oct 2017 10:31:45 +0000 (12:31 +0200)]
Revert "Drop NB_ from GMX_CUDA_NB_SINGLE_COMPILATION_UNIT cmake define"

This reverts commit 3880255b0, which was made in confusion
stemming from the combination of multiple CUDA compilation units,
disabled CUDA textures, and the NB CUDA module structure.
The define in question is actually NB-exclusive,
and PME with CUDA does not need to check it to declare
extern texture references. As PME textures are not accessed
from different PME kernels, those extern declarations are removed
in the child change Ie846193c71142ff5e519e990ef1155b534546a9b.

Change-Id: I75a0e62bc92c7161ba0fbf00d8db2f35cef80bc7

6 years agoSimplify virial handling
Berk Hess [Sat, 30 Sep 2017 21:10:06 +0000 (23:10 +0200)]
Simplify virial handling

The force and virial are tightly connected. This is now expressed
through the new ForceWithVirial object, which is used for algorithms
that compute a separate virial contribution. This clarifies and
simplifies the core mdrun code in several places.

Change-Id: If0f65f1a6f67fb3efc5e4637a183faf4abd5f969

6 years agoRequire template parameter for load function
Roland Schulz [Fri, 6 Oct 2017 23:36:50 +0000 (16:36 -0700)]
Require template parameter for load function

The implicit conversion from load(float*) to both float
and SimdFloat caused multiple issues. The primary ones:
- Extra complexity in the implementation of traits, ArrayRef, SimdReference
- Required compiler tests for ambiguity
- SimdReal x = f(load(m)) // confusing broadcast if f is a scalar function
- x = s*load(m) // error-prone scalar multiply if s is a scalar

The new syntax is load<T>(m) in templated functions and load<SimdReal>(m)
in non-templated functions. While this is slightly longer by itself, it is
clearer and does not require storing values in temporaries (no ambiguous
overload errors).

Also avoids the need for the load proxies.

Change-Id: I8109e9365e956aaea428ec338b6a810444e03d77
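
The load<T>(m) syntax described above can be sketched as follows. This is a simplified illustration, not the GROMACS implementation: FakeSimdFloat, LoadTag and loadImpl are hypothetical names, and the SIMD type is emulated with a plain array.

```cpp
#include <cassert>

// Hypothetical 4-wide stand-in for a SIMD float type.
struct FakeSimdFloat
{
    float lane[4];
};

inline FakeSimdFloat operator*(FakeSimdFloat a, const FakeSimdFloat& b)
{
    for (int i = 0; i < 4; i++)
    {
        a.lane[i] *= b.lane[i];
    }
    return a;
}

// Empty tag type that selects the overload for the requested return type.
template<typename T>
struct LoadTag
{
};

inline float loadImpl(const float* m, LoadTag<float>)
{
    return *m;
}

inline FakeSimdFloat loadImpl(const float* m, LoadTag<FakeSimdFloat>)
{
    FakeSimdFloat x;
    for (int i = 0; i < 4; i++)
    {
        x.lane[i] = m[i];
    }
    return x;
}

// The caller must name the result type: load<float>(m) or load<FakeSimdFloat>(m).
// Nothing is converted implicitly, so no accidental broadcast can happen.
template<typename T>
inline T load(const float* m)
{
    assert(m != nullptr);
    return loadImpl(m, LoadTag<T>());
}

// Templated code then works unchanged for the scalar and the SIMD case.
template<typename T>
inline T squareFromMemory(const float* m)
{
    T x = load<T>(m);
    return x * x;
}
```

Because the type is stated at the call site, an expression such as s * load<float>(m) is visibly scalar, while s * load<FakeSimdFloat>(m) would only compile if a matching operator exists.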

6 years agoUse tag for simdLoad
Roland Schulz [Sat, 7 Oct 2017 00:34:02 +0000 (17:34 -0700)]
Use tag for simdLoad

Use the same simdLoad name for all types, in preparation
for removing the need for SimdLoadProxyInternal.

C++ doesn't support template specialization for
functions, so giving simdLoad a template argument
and specializing on it doesn't work. By passing a tag as
a second argument, standard overloading can be used.

Change-Id: Iaf42ebb74a3347787bcac3bdfd0ef11db1e333bf
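
The tag approach can be sketched as below. All names are hypothetical and chosen for this example only; the two integer SIMD stand-ins differ in width so that the same pointer type can be loaded in two different ways. Overloads cannot differ in return type alone, so the empty tag argument is what lets the compiler pick between them.

```cpp
#include <cassert>

// Hypothetical stand-ins for two integer SIMD types of different width.
struct FakeSimdFInt
{
    int lane[4];
};
struct FakeSimdDInt
{
    int lane[2];
};

// Empty tag types; they carry no data and only select an overload.
struct FakeSimdFIntTag
{
};
struct FakeSimdDIntTag
{
};

// Same function name and same pointer type; the tag picks the result type.
inline FakeSimdFInt simdLoadSketch(const int* m, FakeSimdFIntTag)
{
    assert(m != nullptr);
    FakeSimdFInt x;
    for (int i = 0; i < 4; i++)
    {
        x.lane[i] = m[i];
    }
    return x;
}

inline FakeSimdDInt simdLoadSketch(const int* m, FakeSimdDIntTag)
{
    assert(m != nullptr);
    FakeSimdDInt x;
    for (int i = 0; i < 2; i++)
    {
        x.lane[i] = m[i];
    }
    return x;
}
```

A call then reads simdLoadSketch(m, FakeSimdFIntTag()), and the choice of overload is resolved at compile time.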

6 years agoIntroduced header for communication to/from PME ranks
Mark Abraham [Thu, 13 Apr 2017 23:31:46 +0000 (01:31 +0200)]
Introduced header for communication to/from PME ranks

No functionality changes. This cleans up some structure, and will be
useful for some modernization, use of std::vector, and then new
allocation strategies to suit PME on GPUs.

Eliminated some things in pme-internal.h by moving some declarations
to a header that can be included by the only two source files that are
interested in PP-PME communication. Now gmx_pmeonly() doesn't have to
pass around a large pile of arguments.

Removed a use of typedef struct, and some function parameter types
that no longer need to specify struct in C++.

Removed some unused PP_PME_* constants.

Change-Id: I51629fb6d91b3a486ef24d1f60065e65261d0376

6 years agoFix clang warnings for PME CUDA kernels
Aleksei Iupinov [Wed, 11 Oct 2017 16:16:07 +0000 (18:16 +0200)]
Fix clang warnings for PME CUDA kernels

Change-Id: I28f67c70b1ff4611f2456a5935a727c49e10e691

6 years agoRelax PME solving test complex grid tolerance
Aleksei Iupinov [Wed, 11 Oct 2017 16:28:24 +0000 (18:28 +0200)]
Relax PME solving test complex grid tolerance

PME CUDA solving change (Ic610e7f) tightened the output grid
tolerance from 50 down to 16 ULPs, making one of the LJPME tests
fail in post-submit. This change relaxes the tolerance to 40 ULPs.

Change-Id: Icd0c1aff868e2d1ecb76522a1a2174b3156fc356

6 years agoCleaned up ewaldcoeff for PME-only ranks
Mark Abraham [Fri, 14 Apr 2017 02:23:44 +0000 (04:23 +0200)]
Cleaned up ewaldcoeff for PME-only ranks

Previously, the runner initialized all kinds of PME ranks with the
initial values of the Ewald coefficients. The values passed to
gmx_pmeonly were never read: the variables are used only to store new
values, which happens when the PP rank directs the PME rank to switch
grids during load balancing.

Change-Id: Ibe581a7111239f28f874b43dc13dcc6abd025b60