BioD PNPI Git Repos - alexxy/gromacs.git/log

Merge branch release-5-1 into release-2016

Change-Id: I4195829e82c50f3d0aa04644a99ae53beb2e24c3

GROMACS 2016 beta1 release

Change-Id: Ifed8cb0e7ea686ee5f1ade36c6ffe701e60f41f4

Remove static function qualifier in OpenCL kernel utils

Static function qualifiers have been supported since OpenCL 1.2. When the
Mesa 3D Gallium Clover state tracker is used on AMD Radeon GCN cards, it only
provides OpenCL 1.1 support, so static function qualifiers are not
supported.

Change-Id: I6d7aceabefeb0ce825b698962830201728bc56d7

Use STRLEN for buf

Change-Id: Idad6bfb340160c96aa979943aee32c3b909940a5

Fixed large file issue on 32-bit platforms

At some point gcc started to issue a warning
instead of a fatal error for the checking
code; fixed to really generate an error now.

Fixes #1834.

Change-Id: I5827f358b22de516becaac02d81229b82b02297b

Additional Jenkins documentation

Some additional content could be added, but now there are at least a few
words on each of the build types. The overall structure of the
Jenkins-related user-/developer-facing documentation should now be
there, though. A significant part of the Jenkins-related documentation is
in the releng repo, which is now more clearly referenced from here where
relevant.

Related to #1731.

Change-Id: If71cb725f966ea5acd72585fb3a1aa65e1a269ff

Bump clang static analyzer to version 3.8

This seems to avoid false positive issues in AWH code with
the old version in clang 3.4.

Changed the way clang-analyzer is matched to a slave - now
the version is part of the analyzer specification.

Addressed a minor false positive, apparently caused by the analyzer
being unable to be sure that posfn won't change its emptiness.

Change-Id: Ia7e571542b361a03d0cba7c061a1a970c24be44f

Minor update to sphinx dependency

Jenkins is using a new version of sphinx, and that warns that pngmath
is deprecated. So we replace that with the suggested imgmath.

Change-Id: I40dbc5f04565391535ea7d59aba0d23a3a206805

Include releng documentation to developer guide

Include the rst documentation from releng repository into the developer
guide. This mainly adds the infrastructure to do this; actual
documentation formatting is not yet very nice... If releng is not
available, a dummy page is inserted instead.

Split general information about the build tree into a separate file from
gmxVersionInfo.cmake.

Sphinx 1.3 is required to be able to parse Google-style docstrings into
the documentation.

sphinx.ext.pngmath is deprecated in Sphinx 1.4.1, so replace
its use with the recommended sphinx.ext.imgmath.

Change-Id: Ic8638ee341533b07af2f5accc48664f84c1d5cb4

Fix precision loss in tabulated normal distribution

The tabulated normal distribution was generated by
simply integrating a gaussian function, which led to
severe loss-of-precision for double precision tables.
Fixed by implementing proper inverse error functions
in the math module, and using this to fill the table
directly.

Fixes #1930.

Change-Id: I74b72b4a6d36f1eb382c6456501ce5644c92725e
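
The idea of filling the table directly from an inverse function, rather than by integrating a Gaussian, can be sketched as follows. This is only an illustration under stated assumptions: the names `normalCdf`, `inverseNormalCdf` and `makeNormalTable` are hypothetical, and a plain bisection on `std::erfc` stands in for the proper inverse error functions that the commit adds to the math module.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Standard normal cumulative distribution function, via std::erfc.
double normalCdf(double x)
{
    return 0.5 * std::erfc(-x / std::sqrt(2.0));
}

// Hypothetical stand-in for a proper inverse error function:
// invert the CDF by bisection on a wide bracket.
double inverseNormalCdf(double p)
{
    assert(p > 0.0 && p < 1.0);
    double lo = -40.0;
    double hi = 40.0;
    for (int i = 0; i < 200; ++i)
    {
        const double mid = 0.5 * (lo + hi);
        if (normalCdf(mid) < p)
        {
            lo = mid;
        }
        else
        {
            hi = mid;
        }
    }
    return 0.5 * (lo + hi);
}

// Fill the table directly: entry i is the quantile at the centre of bin i,
// so no error accumulates from one entry to the next.
std::vector<double> makeNormalTable(std::size_t size)
{
    std::vector<double> table(size);
    for (std::size_t i = 0; i < size; ++i)
    {
        const double p = (static_cast<double>(i) + 0.5) / static_cast<double>(size);
        table[i]       = inverseNormalCdf(p);
    }
    return table;
}
```

Each entry is computed independently of the others, which is the property that avoids the accumulated loss of precision described above.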

Facilitate linking with libcxx

Using recent clang static analyzer versions seems to be easier with
libcxx, which we should anyway support for building and testing.

Reworked the testing for C++11 support, since it is sensible to first
test the compiler, and then the standard library. This helps users
diagnose problems. Converted this code to a function (for better
scoping), added some docs, and made the semantics clearer. Added
some explicit testing for other non-library C++11 functionality.

Introduced GMX_STDLIB_CXX_FLAGS, so that all the linked executables
can have their sources compiled with any compiler flag that might be
required. Alternatives like requiring the user to modify
CMAKE_CXX_FLAGS, or adding COMPILE_FLAGS properties to targets didn't
seem great. The latter also triggers clang to issue warnings for
source files that are still C (group kernels and TNG).

Introduced GMX_STDLIB_LIBRARIES, so that linking can proceed
correctly.

For example, the check for C++11 support needs to be passed a library
to link during the try_compile(), and the only reliable way for the
user to do that before this patch was to add the linker flag to
CMAKE_CXX_FLAGS, which then leads to clang warning about the unused
linker flag as it compiles each source file. The GMX_STDLIB_*
mechanisms probably also permit users to build against different
versions of GNU libstdc++, which may be useful on distributions like
CentOS, because CMake has no mechanism at all for this.

Updated the install guide to clarify how to choose a standard library
in the various cases. Updated the guide for using GROMACS as a library,
and the template README.

Fixed issue where the template did not have C++11 compiler flags
propagated properly. The template now builds correctly, via both
Makefile.pkg and cmake, both with normal default libstdc++ and libc++
selected via this mechanism and propagated to the installed build
system for the template.

Fixed issue where SIMD suggestion would produce a garbage suggestion
when the linking failed and OUTPUT_SIMD was left unset.

Refs #1745,#1790

Change-Id: Ieef3b47de5c1a00a203baa1b34ebf70535cf5ff0

Flush streams when not writing newline character

Some of our routines use the carriage return without a newline
to keep writing the status e.g. on stderr.
For some operating systems this seems to lead to the output
being cached in the buffers, so this change adds an explicit
fflush() for these print statements.

Fixed #1772.

Change-Id: I3ad9c4f0e962d8a0b2f8d2341af69f0e3d01a477
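
A minimal sketch of the pattern described above, assuming a hypothetical helper `printStatusLine` (not a GROMACS function): the status line is rewritten in place with a carriage return, and the stream is flushed explicitly because no newline is written.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical helper: rewrite a one-line status in place.
// '\r' returns to the start of the line without ending it, so a buffered
// stream may hold the text back; the explicit fflush() pushes it out.
// Returns the number of characters written, as reported by fprintf().
int printStatusLine(std::FILE* stream, long step)
{
    assert(stream != nullptr);
    const int written = std::fprintf(stream, "\rstep %ld", step);
    std::fflush(stream);
    return written;
}
```

Calling `printStatusLine(stderr, step)` inside a loop would keep overwriting the same terminal line.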

Optimized the vsite threading

On many threads a significant part of the vsites would end up in
the separate serial task, thereby limiting scaling. Now two weakly
dependent tasks are generated for each thread and one of them uses
a thread-local force buffer, parts of which are reduced by different
threads that are responsible for those parts.
Also the vsite setup now runs multi-threaded.

Change-Id: I5b117ea2d4af7c1c5844748e2a53a64f69415384

Fix CUDA shmem bug

There was a bug in the (nasty) manual CUDA shmem management.
This does not seem to have affected any results.
Also replaced 2 by c_nbnxnGpuClusterpairSplit where appropriate.

Change-Id: I27cf8b02ad78a6d8ef0825d666103f7494c651a2

Add OpenMP for functions limiting scaling

Loops over the number of atoms cause a significant amount of serial time
with a large number of threads.

Change-Id: I5f7b894c92c3a0aabe417914905b813d5fccc739

Support cmap with QMMM

QMMM only supported bonded interactions using up to 4 atoms.
Now any number is supported and some hard-coded assumptions have been
removed.

Change-Id: I69b78ad5e7f8520cda792a8c2f212adef0a2c7c8

Improved description for GPU/CPU time ratio

Made the output slightly more explicit according to Mark's
suggestion three years ago in a bug report (which wasn't a bug).

Fixes #1291.

Change-Id: I0021afb5b8d964c6cff5bfbff58a5ebef2efdcbb

Fixes for clang 3.7 on PowerPC

We should not test testBits on platforms that don't use it, in
particular, BG/Q.

Our cyclecounter implementation is correct, but clang 3.7 issues
harmless warnings that I do not plan to fix.

Change-Id: Id6b21c3984051ab9ef8598d53746df3b38003aef

Correct and clarify output units for fluct_props

Compressibility and bulk modulus had units reversed in output (values themselves
are correct - just the text units next to them were reversed). Clarify K is in
denominator of heat capacities.

Change-Id: Id68a050b5d547903765526a6fd1c340a9303cb08

Fix Luzar/Chandler literature reference in g_hbond

Fixes #1935.

Change-Id: I5307b2d1962730e703865278b21ec1c8022424d8

Fix constants, units and conversion factors

Pressure definition was wrong, should be in bar. There's no point in
giving a standard conversion factor to other units that nobody uses.

Giving the "accuracy" of various quantities is useless without
specifying what the value in parentheses means. We make no use of
those accuracy values anywhere in the code. We should cite the source
and leave it up to the user with some potentially more accurate value
to decide if their value, with _its_ error, is close enough to the
value used in GROMACS to be acceptable.

Change-Id: I417873db5ab01aa53ad409988eb738442f2f0305

Modify lmfit library to avoid clashes with other versions.

Due to the inclusion of the lmfit package in certain operating
systems clashes could occur when linking different codes. In
order to prevent this the exported symbols from this library
were renamed by prepending the names with gmx_ and the .c files
renamed to .cpp.

Now based on lmfit 6.1.

Change-Id: I67ea602d7fdacc08efd88279b50ca42ebe886a4c

Update mdrun termination info

Both the mdrun help and stderr/log messages reported that mdrun
would stop at the next step or next nstlist step, which has not been
correct for some time.

Fixes #1918.

Change-Id: Ia890ffb3ccfdbdbed3d003b149c3cbb55e5c1818

Support for OpenCL CI builds

Add support for OpenCL builds to gromacs.py, and some configurations to
the pre-submit matrix.

Change-Id: I321af28bc8ac20ad4629d40ec68c542ec96ef9c6

OpenMP parallelization for pull

The pull code could take up to a third of the compute time for OpenMP
parallel simulations with large pull groups.
Now all loops over atoms have an OpenMP parallel version.

Change-Id: I59e65a3e33782828cca58cb843c33afd1502e4b5
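
As an illustration of what an OpenMP parallel version of a loop over atoms can look like, here is a hedged sketch of a mass-weighted average along one dimension. The function `centerOfMass1D` is hypothetical and much simpler than the real pull code; it only shows the reduction pattern. Without OpenMP enabled at compile time the pragma is ignored and the loop runs serially.

```cpp
#include <cassert>
#include <vector>

// Hypothetical example: mass-weighted average position along one dimension.
// Each iteration only adds to the two sums, so an OpenMP reduction lets
// the threads accumulate privately and combine the results at the end.
double centerOfMass1D(const std::vector<double>& x, const std::vector<double>& mass)
{
    assert(!x.empty() && x.size() == mass.size());
    const long n      = static_cast<long>(x.size());
    double     sumMx  = 0.0;
    double     sumM   = 0.0;
#pragma omp parallel for reduction(+ : sumMx, sumM) schedule(static)
    for (long i = 0; i < n; ++i)
    {
        sumMx += mass[i] * x[i];
        sumM += mass[i];
    }
    return sumMx / sumM;
}
```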

Merge branch release-5-1

No conflicts.

Change-Id: I582e9c267588d446f55788828866cc50781d74dd

Make Sphinx build silent for version 1.3+

The handling of the theme has changed, so now the code sets the theme
differently for different versions. This should allow upgrading Sphinx
on Jenkins in-place (older branches do not use Sphinx).

Also, make the copyright year automatically update based on the day the
documentation is generated to avoid manual maintenance.

Change-Id: I180b4307ba5f3d95d39196661831dc40de1424e1

Remove CUDA host compiler consistency checks

Since CMake 2.8.10 the host compiler is set by CMake which effectively
broke our consistency checks. However, these checks are hard to
maintain, and even though CMake does not do any checks we are better off
without this code.

This commit removes the checks and unconditionally sets the
CUDA_HOST_COMPILER variable for CMake 2.8.9 and earlier; that code
should be removed when CMake 2.8.10 is required.

Fixes #1248

Change-Id: I6c08b59642dd3b5d18c5fe5ac454f19c75718f6a

Fix SIMD detection on AMD AVX CPUs w/o fma (master)

This is essentially the same fix as for release-5.1,
but since the CPU detection code has been rewritten from
scratch, this part has been moved to the SIMD module.

Fixes #1906.

Change-Id: Ie25d6013c737279ece902d40d1f54a6e3c5966bb

Avoid allocating heap memory for tabulated distribution

The previous version of the table-generation routine
first allocated memory on the heap and then returned a
reference, which is needlessly complicated, and it probably
led to a small memory leak despite the comment. We
now return the entire vector by value instead,
since the copy should be optimized away anyway.

Change-Id: Icf07286e785eeec213ee44d4770df91b71977f19

Mark optional input/output files in gmx order

Mark files that are not always read, or which are not written if the
user does not ask for them, with ffOPTRD/ffOPTWR instead of using
ffREAD/ffWRITE. With changes in 5.1, files marked with ffREAD are
really mandatory, since we check already during command-line parsing
that the file actually exists (so that we can give a good error
message). This change also improves the understandability of the help.

Fixes #1954.

Change-Id: I6a991786b8cc48a61eedad9f855a93559bb8a5a0

Merge "Merge branch release-5-1"

Use HardwareTopology to set nthreads_hw_avail

This removes duplication and confusing logic where the same quantity was
populated using different approaches... This also makes it easy to
reason about and replace uses of this variable with its equivalent in
HardwareTopology, for code that can operate on HardwareTopology alone
instead of the full gmx_hw_info_t.

Change-Id: I543d99b9a2ae5b4ca866c4de6e3e049c271671e8

Remove gmx_hw_info_t::ncore

This information is already available through HardwareTopology that is
contained as a member in the same structure.  Add a helper function to
HardwareTopology to return this information, instead of having a
duplicate member for this.  This makes it easier to reason about the
code.  The code that populated ncore was also incorrect.

Fixes #1903.

Change-Id: Ic6aef660ec07ac4ec481593fb9ae5aa3c87df80e

Workaround for gcc false positive warning

The compiler version that emits the false positive out of bounds
indexing warning and requires the workaround is:
gcc (Ubuntu 5.2.1-22ubuntu2) 5.2.1 20151010
(default on Ubuntu 15.10).

Change-Id: I53e88a0daed7224165e456d76611b2c182d315e3

Improve release workflow testing

Run the regression tests against the installed code when testing the
release tarballs. The parameters are the same as in Test_tarballs_5_1,
except that an extra -mdrun argument is passed to actually test the
mdrun-only build.

Also run the install target for normal builds, although there is no
other testing for the installation yet.

Change-Id: Ib80fe58aa0805b3d97443b97fdb8a59339615a37

Disable multiple ranks for non-MPI unit tests

Make tests not explicitly declared in CMake as supporting MPI only run
on a single MPI rank.  This is already the case for the 'test' and
'check' targets, but manually it is possible to run them with multiple
ranks.  Now all other ranks except the master will exit, and only master
will run the tests.  This avoids potential race conditions in tests not
designed for concurrent execution, e.g., related to file system access.
Some cleanup of the related CMake macros.

Closes #1795

Change-Id: I69e88ba3419cce96eb5b0c7e145643accc65533d

Merge branch release-5-1

Change-Id: I6fdd1b6186911b41d6089c70ee26eff1909fd6e5

Fix multiple MPI ranks per node with OpenCL

Similarly to the thread-MPI case, the source of the issue was
the hardware detection broadcasting the outcome of GPU detection
within a node. The MPI platform and device IDs, OpenCL internal
entities, differ across processes even if both platform and device(s)
are shared. This caused corruption at context creation on all ranks
other than the first rank in the node (which did the detection).

This change disables the GPU data broadcasting for OpenCL with MPI.

Fixes #1804

Change-Id: I90defdcb3515796c46ba89efb0ed1e3c8b1b35f9

Fix issues with failing thread affinity setting

- Fix the conditional in MPI_Reduce() to correctly detect only a subset
  of the ranks failing.
- Ensure that all ranks reach all the MPI_Reduce() calls to avoid
  deadlocks on heterogeneous nodes, where only some nodes could fail the
  consistency checks.  As a side effect, always produce the final
  message about failed affinity settings, together with its advice.
- Only do MPI_Reduce() if there are multiple ranks.
- Fix incorrect #ifdef (caused by rebasing the original change over the
  change that made GMX_MPI 0/1-valued).

The last change seems to fix #1951.

Change-Id: I93c8c4bd6051c9077736f9fc19e6e0637c6d6435

Merge "Merge branch release-5-1"

Disable CUDA profiler when using OpenCL

Replacing GPU_FUNC_TERM with CUDA_FUNC_TERM generates correct empty
implementations and therefore fixes linker errors.

Change-Id: I6485471eeb22bec9e6f0c3528bff7310593e3be6

Fix OpenCL assertion

An incorrect vdw type count check would cause an assertion failure for
some types of runs, but not for others.

Change-Id: Ia941f5a5daf37eeabca45fe4c078f0c6327a1d4f

Fix numerically unstable Gaussian test

Change-Id: If3461ae29584e8afba6d1c8646bfb37aaeb16deb

Fix pull legend with external potential

A typo could make the legend for the pull reference appear in the
pullx output file, while there is no (and should be no) reference
output.

Change-Id: I1855c5f3e6f16e1447add84bb18e3e1d36399cef

Fix numerically unstable selection test

Change-Id: Iaea0b6bcbda63b3d8d889a935725a3ece640b9f9

Move out CUDA profiler triggers from NBNXN

The profiler triggering is a general functionality that should not be
tied to the nonbonded module. Hence, it is now moved into the gpu_utils
module and called directly at reset/cleanup.

Change-Id: Ifa862dbcbc6386c514dfcc1f6a5169ea6ae8d09f

Allow rcoulomb>rvdw with PME

We have Coulomb PME + LJ cut-off kernels that support rcoulomb>rvdw
or PME load balancing. Now we allow this setup to be chosen also
through mdp options. This is mainly a straightforward change, only
the Verlet buffer calculation needed to be adapted to support
different buffer lengths for LJ and Coulomb.

Change-Id: I9d8176656435fbdb4ec48b3dd2c7128d13c5ef07

Replace libxml2 with tinyxml2 for use in test code

Building libxml2 on complex machines has been problematic for users
and developers, and there is no pressing reason to need the full
capabilities of libxml2 for reading and writing data for test
cases. Support for detecting libxml2 is retained, because other code
in Gerrit might plan to use it, but it currently is not needed for
anything.

This is version 3.0.0 of tinyxml2, modified only so that
XMLPrinter::PrintSpace will write XML files with nesting indentation
of 2 spaces rather than 4, to match existing XML files in the GROMACS
repo. Also, a cppcheck suppression was added, and a fix was proposed
upstream that helps avoid it.

The resulting XML is identical in format to the current form in the
repo, except that empty text fields are rendered with
'<string></string>' rather than using the '<string/>'
shorthand. Those have been regenerated.

One known limitation is that '\r' is not handled correctly in text
nodes (string or text block), as it is converted to '\n'. libxml2 did
not handle it correctly in text block either, and we don't need this
behaviour to work, so it is no longer tested for either case.

Another limitation is that a non-CDATA text element that is not empty
and contains only whitespace is written correctly by TinyXML2, but is
parsed to an empty element upon reading. So GROMACS cannot rely on
refdata that wants to write a whitespace-only non-empty String (a
TextBlock is OK), which we might want to do some time. Attempting to
do so throws an exception, so we can't inadvertently run into trouble.

Some of the helper code that reports how strings in refdata do not
match has been slightly improved, so that the string that doesn't
match is wrapped in single quotes, so that the manner in which
whitespace is not matched can be more easily seen on the terminal.

Refs #1947

Change-Id: I6153136b67b41e7141fc3f1fc8ee4005a72c90f1

Tune nstlist increase for KNL

Change-Id: If4162c307fd9b31f7a76a18ee8a1fa45830559e1

Fix parse_digits_from_string

The new GPU-passing functionality was not checked for an empty string,
unlike the functionality that it extended. Noted TODO for future.

Change-Id: I587e526802a4e880fa966836b555fb07e2b3ad30
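
The kind of guard involved can be sketched like this. The function `parseDigits` and its error handling are assumptions made for illustration; they are not the signature or behaviour of the real parse_digits_from_string.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical parser: turns a string such as "0213" into {0, 2, 1, 3}.
// The empty string is rejected up front instead of being passed on.
std::vector<int> parseDigits(const std::string& digits)
{
    if (digits.empty())
    {
        throw std::invalid_argument("Empty digit string");
    }
    std::vector<int> ids;
    ids.reserve(digits.size());
    for (const char c : digits)
    {
        if (c < '0' || c > '9')
        {
            throw std::invalid_argument("Invalid character in digit string");
        }
        ids.push_back(c - '0');
    }
    assert(ids.size() == digits.size());
    return ids;
}
```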

Broadcast pull external potential provider string

The pull external potential change was added without broadcasting
the string that tells what the external potential provider is.
Note that all other strings in inputrec are in symtab, but this one
is not.

Change-Id: I00959754bfe9ad96fe2c0b7682514f893df78f54

Fix mdp label generation

The extra colons meant the rst didn't generate
the label that is expected.

Change-Id: Icbb5bb8aa674f043ae45e474ced8e6f5706589b4

Fix gmx hbond group overlap check

gmx hbond does not support partially overlapping analysis groups.
The check in the code was broken and never caught this, resulting in
incorrect output that might look OK at first sight.
Also corrected bitmasks = enums that (intentionally?) seemed to give
correct results by not using non power of 2 enum index entries.

Change-Id: Ic1643f1d552e35d35873885a3cdf49f19ec66ae3
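
What a partial-overlap check has to decide can be sketched as below. This is not the gmx hbond implementation: the function name is hypothetical, and the sketch assumes that each group lists an atom index at most once and that identical groups are acceptable while partially shared ones are not.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Hypothetical check: two index groups partially overlap when they share
// at least one atom but are not the same set of atoms.
bool groupsPartiallyOverlap(std::vector<int> a, std::vector<int> b)
{
    std::sort(a.begin(), a.end());
    std::sort(b.begin(), b.end());
    std::vector<int> common;
    std::set_intersection(a.begin(), a.end(), b.begin(), b.end(), std::back_inserter(common));
    const bool identical = (a == b);
    return !common.empty() && !identical;
}
```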

Make thread affinity failures always end up in log

Remove calls to md_print_warn(NULL, fplog, ...), which were used for
cases where only some of the ranks could fail. But if non-master rank
failed, the error went only to stderr, not into the log file. Make this
work more uniformly such that the error always ends up in the log file.
The approach could possibly be generalized (it is now local to
threadaffinity.cpp, and only works for warnings where the text is the
same on each rank), but that is probably easier after the logging is
using C++.

Add some trailing newlines for consistent output from md_print_warn().

This also makes all md_print_warn/info calls use the same pattern, which
makes things easier to understand, and allows replacing them with a
simple object.

Related to #1083.

Change-Id: I03a3524ed883bed0c5b039836e9d1741c672d97d

Reduce compilation coupling of SIMD module

Analysis tools that call bonded functions should not have to include a
header that unnecessarily includes the SIMD module.

Change-Id: I116429b4fc9f6e0a67735d58ac8ddd4aa9b78a1e

Fix AVX512 suggestion

Change-Id: Id8a9d4b9c9886e4a023f3d8a9a035dc324029683

Remove idioms deprecated in C++11

- use noexcept instead of throw()
- don't rely on default copy/assign if either one or the destructor is
explicitly defined

Enforced with clang -Wdeprecated

Change-Id: I4d2f32d6c3e880092fb27d81834c458f788a9e3f
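
Both points can be illustrated with a small, purely hypothetical class (`Buffer` is not GROMACS code): `noexcept` takes the place of `throw()`, and because the destructor is user-defined, the copy constructor and copy assignment are written out rather than left to the deprecated implicit versions.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical owning buffer used only to illustrate the two idioms.
class Buffer
{
public:
    explicit Buffer(std::size_t size) : size_(size), data_(new double[size]()) {}

    // The destructor is user-defined, so the copy operations are spelled
    // out as well instead of relying on implicitly generated ones.
    Buffer(const Buffer& other) : size_(other.size_), data_(new double[other.size_])
    {
        for (std::size_t i = 0; i < size_; ++i)
        {
            data_[i] = other.data_[i];
        }
    }
    Buffer& operator=(const Buffer& other)
    {
        Buffer copy(other);
        swap(copy);
        return *this;
    }
    ~Buffer() { delete[] data_; }

    // C++11: noexcept replaces the deprecated throw() specification.
    void swap(Buffer& other) noexcept
    {
        std::swap(size_, other.size_);
        std::swap(data_, other.data_);
    }
    std::size_t size() const noexcept { return size_; }

    double& operator[](std::size_t i)
    {
        assert(i < size_);
        return data_[i];
    }

private:
    std::size_t size_;
    double*     data_;
};
```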

Add QMMM checks

QMMM only works with cutoff-scheme=group and dynamics. grompp and
mdrun now check for this.
Also fixed potential print of NULL string in init_gaussian.

Fixes #1940.

Change-Id: I8215e339070d811ba07d17d743061b18b665a33b

Merge branch release-5-1

Several changes from release-5-1 have been fixed separately
in master, such changes are omitted here.

Change-Id: Ib03a2091ad6f76e785f7a6cc9ea91493d246da3b

Corrected value of electric conversion factor f in the manual

- As suggested by Christopher Neale
- Made a new latex command for the value so that it can appear
  consistently throughout all uses in the .tex manual
- changed the value to f = 138.935 458(9), which is value and
  standard deviation calculated from NIST 2010 CODATA, as also
  used by GROMACS in units.h (full value calculated from
  numbers in units.h is f = 138.935 457 839 ... )
- If I calculated correctly, the pre-June-2015 value of the std-dev
  was correct, just the last two digits of the value itself
  were swapped in some places 58(9) <-> 85(9)
- I believe commit 1ee3b15530a5 accidentally overwrote the std-dev
  with digits from the value itself

Change-Id: I293768b657a81af0a4dc0865db8c4a1b0eebd4ad
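
The quoted value can be reproduced from the CODATA 2010 constants with a few lines of arithmetic. The sketch below assumes the usual definition f = e^2 N_A / (4 pi eps0), expressed in kJ mol^-1 nm e^-2; the constant and function names are made up for this example and are not the macros in units.h.

```cpp
#include <cassert>
#include <cmath>

// CODATA 2010 values in SI units (names are hypothetical).
constexpr double c_elementaryCharge = 1.602176565e-19; // C
constexpr double c_avogadro         = 6.02214129e23;   // 1/mol
constexpr double c_epsilon0         = 8.854187817e-12; // F/m

// f = e^2 N_A / (4 pi eps0), converted from J m mol^-1 to kJ nm mol^-1.
double electricConversionFactor(double elementaryCharge, double avogadro, double epsilon0)
{
    assert(epsilon0 > 0.0);
    const double pi = std::acos(-1.0);
    const double si = elementaryCharge * elementaryCharge * avogadro / (4.0 * pi * epsilon0);
    return si * 1.0e9 * 1.0e-3; // m -> nm and J -> kJ
}
```

Evaluating `electricConversionFactor(c_elementaryCharge, c_avogadro, c_epsilon0)` should give a value close to the 138.935 458 stated in the commit message.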

Spelling fixes

Thanks to: Nicholas Breen / DebiChem
https://anonscm.debian.org/viewvc/debichem?view=revision&revision=6293

Change-Id: I5bfecc4cc623276b65cd2dd6e455eff697d4a227

Do not create empty man/man7 directory

Thanks to: Nicholas Breen / DebiChem

Change-Id: I42d0bf14ea774c623ccf9af687ac54bbe634da4c

AVX512 transposeScatterIncr/DecrU with load/store

Also remove one more mask

Change-Id: I0d27eb42eead92d2f50725b2f3831af7e9ee229e

Fix hwloc narrowing warning on 32-bit architectures

hwloc always uses a 64-bit integer for cache size, while
size_t might be 32 bits on a 32-bit architecture. Since we
do not expect any 32-bit architectures to come with more
than 4GB of cache, we simply cast it explicitly to size_t.

Change-Id: I35b064de60a6b20dcf9ab6af751b4721037e271c
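
A sketch of such an explicit narrowing cast, with a hypothetical helper name. The assertion is an addition for illustration only; the commit message describes just the cast.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>

// Hypothetical helper: the cache size arrives as a 64-bit integer, while
// size_t may be only 32 bits wide. The explicit cast documents that the
// narrowing is intended and silences the compiler warning.
std::size_t cacheSizeToSizeT(std::uint64_t sizeInBytes)
{
    // Added for illustration: the value is expected to fit into size_t.
    assert(sizeInBytes <= std::numeric_limits<std::size_t>::max());
    return static_cast<std::size_t>(sizeInBytes);
}
```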

Mark more variables as advanced

CMake 3.x introduced a few more non-advanced CUDA cache variables that
most users do not care about, so this commit hides them.

Refs #1764

Change-Id: I926b03ece080d8f8255e37dc5471275d48956d49

Fix compile error for IBM VMX on ppc64

gcc-5.1 does not provide vec_mul(), only vec_madd().
This has already been fixed in the master branch with
the new implementation.

Fixes #1812.

Change-Id: I02c691b16bc0827cfa799aa99fd4fc7b84045e3f

Fix SIMD detection on new AMD AVX CPUs w/o fma

We earlier assumed that all AMD CPUs had fma
support and could use AVX_128_FMA, but with this
change we properly detect it and revert to
AVX_256 otherwise.

Fixes #1906 for release-5.1.

Change-Id: I803a6bfa75c5069b73023688a477cfca03c1ddcd

Correct nrdf for 1D/2D systems

With COMM removal, grompp would subtract degrees of freedom also for
VCM groups with fully frozen dimensions, i.e. 1D/2D systems.
Also fixed division by zero for groups with #DOF=0 with VV.

Fixes #1923.

Change-Id: I0ba2535df0495947d9bbb6ee1ef29f519635c221

Fix unit test segfaults with gcc-6.0

Work around a bug in gmock-1.7 by adding the flag
-fno-delete-null-pointer-checks for the gmock
and testutil sources.

Fixes #1911.

Change-Id: I3e2d3fde99898191437d9eed986659f6f6fb056a

Fix incorrect return type for scalar version of simd andNot

The return type for the integer version was incorrectly set
to float. Fixed by properly adding separate integer, float
and double versions. This has never been used in any higher
level code yet, so it won't have caused any problems.

Refs #1930.

Change-Id: I84a4dfe51cf898664631ec2b0313ecf8e11a4a3a
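
Separate overloads per type, each returning its own type, could look roughly like this. The name `andNotSketch` is hypothetical, and the sketch assumes the operation means (~a) & b applied to the bit pattern of the arguments.

```cpp
#include <cstdint>
#include <cstring>

// Integer version: returns an integer, not a float.
std::int32_t andNotSketch(std::int32_t a, std::int32_t b)
{
    return ~a & b;
}

// Float version: operate on the bit pattern via memcpy.
float andNotSketch(float a, float b)
{
    static_assert(sizeof(float) == sizeof(std::uint32_t), "unexpected float size");
    std::uint32_t ia, ib;
    std::memcpy(&ia, &a, sizeof(a));
    std::memcpy(&ib, &b, sizeof(b));
    const std::uint32_t ir = ~ia & ib;
    float               r;
    std::memcpy(&r, &ir, sizeof(r));
    return r;
}

// Double version: same idea with a 64-bit pattern.
double andNotSketch(double a, double b)
{
    static_assert(sizeof(double) == sizeof(std::uint64_t), "unexpected double size");
    std::uint64_t ia, ib;
    std::memcpy(&ia, &a, sizeof(a));
    std::memcpy(&ib, &b, sizeof(b));
    const std::uint64_t ir = ~ia & ib;
    double              r;
    std::memcpy(&r, &ir, sizeof(r));
    return r;
}
```

With a sign-bit mask such as `-0.0f` as the first argument, the floating-point versions clear the sign of the second argument.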

Added references to ProtSqueeze and g_membed to the user guide

Change-Id: Ia05a3710eb170173bae9257993f05d2377c02db6

Fix coolquotes

The unseeded generator always produced the same number.

Change-Id: I7ce678e61b75aa789e7535bfedb0021c1f2117e7
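
The effect is easy to reproduce with the C++ standard library, used here as a stand-in for the GROMACS random engines. In this sketch, `pickQuoteIndex` is a hypothetical function: a default-constructed `std::mt19937` always starts from the same state, so the generator is seeded from `std::random_device` instead.

```cpp
#include <cassert>
#include <cstddef>
#include <random>

// Hypothetical quote picker. Seeding from std::random_device avoids
// starting from the same generator state on every run.
std::size_t pickQuoteIndex(std::size_t numQuotes)
{
    assert(numQuotes > 0);
    std::random_device                         device;
    std::mt19937                               rng(device());
    std::uniform_int_distribution<std::size_t> dist(0, numQuotes - 1);
    return dist(rng);
}
```

By contrast, two default-constructed `std::mt19937` objects produce identical sequences, which is the kind of behaviour the fix removes.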

Revert to treating -1 as "generate a seed" in .mdp

Commit 2d0247f6 changed the grompp behaviour while refactoring the rng
seed setup, without mentioning it in the docs or commit message.
Several of the tools also had their functionality changed, but in most
cases there was no previous documentation and now there is
some. However, we can't change grompp behaviour without catering for
all the scripts whose behaviour would change.

This change restores the previous functionality for grompp.

Change-Id: Id77098e1f03ebf96207f8c9908b2dada28c59e6a
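
The restored convention amounts to treating -1 as a sentinel. A minimal sketch, assuming a hypothetical `resolveSeed` helper and using `std::random_device` in place of whatever grompp actually uses to generate a seed:

```cpp
#include <cstdint>
#include <random>

// Hypothetical helper: -1 means "generate a seed"; any other value is
// used as given, so existing scripts keep their behaviour.
std::int64_t resolveSeed(std::int64_t requestedSeed)
{
    if (requestedSeed == -1)
    {
        std::random_device device;
        // device() is unsigned, so the generated seed is never -1.
        return static_cast<std::int64_t>(device());
    }
    return requestedSeed;
}
```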

Fix includes in readpull.cpp

Change-Id: Iaf055fc196a17dd59795c6579caa0afc210a0c82

Add option for external pull potential

A module other than the pull code can now provide the potential
and force for a pull coordinate, when the pull coordinate type is set
to external-potential. It is the responsibility of the module to call
a function that applies the scalar force to the atoms in the groups
involved in the pull coordinate. A registering function has to be
called by the provider module at setup. This function checks the
consistency of the pull setup and checks a provider string the user
provided against the string of the providing module.
This functionality is intended to be used by the AWH module.

Removed the set_pull_coord_reference_value() function. Instead of
externally modifying the reference value, an external pull potential
should be used.

Change-Id: I231fe43fe3a2148a7a77d979b0fdb234d3de5c9b

Add test and improved docs for custom checkSequence

I tried to use this, but ran into problems if I passed a non-unique id
that was not NULL to the compound checker. Passing NULL is the
behaviour for the implementation of checkSequenceArray, and that seems
to be often appropriate here, too.

Change-Id: I9b02e4632085ad995ef67bbb52959f1533d56c34

Revert behavior for handling SETTLE errors

SETTLE errors were changed to fatal in 5.1.2. This causes problems with
minimization of some systems, since a SETTLE error is likely to occur at the
beginning of the minimization. This commit reverts that such that
it only issues a warning as before.

Fixes #1915.

Change-Id: I43ea1878cb61aecf5ae2c5ebcc67fcbda27ba50f

refactor CUDA texture-based table initialization

Also makes conditional the allocation of non-bonded parameter table and
related CUDA texture, skipping it when combination rule kernels are
used.

Change-Id: I6f7e25ff3906d8f92da22d2a227292bcb9e9fc74

avoid CPU oversubscription with GPUs

If the user specifies the number of OpenMP threads on the command line N
such that N*#GPU > #hardware threads, mdrun will determine the number of
thread-MPI ranks to start based on the number of GPUs without taking
into account the total number of threads that such a run would use. This
will result in CPU oversubscription.

This change will reduce the number of tMPI threads started to prevent
oversubscription.

Fixes #1338

Change-Id: I6bceaa2230686fee2e410caf8ae97c43847a5c28
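
The arithmetic behind the fix can be sketched as follows. The function and parameter names are hypothetical, and the real mdrun logic takes more factors into account; the sketch only shows how capping the rank count keeps ranks times threads within the available hardware threads.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical sketch: start at most one rank per GPU, but never more
// ranks than fit when each rank runs the requested number of OpenMP threads.
int numRanksToStart(int numGpus, int numOmpThreadsPerRank, int numHardwareThreads)
{
    assert(numGpus > 0 && numOmpThreadsPerRank > 0 && numHardwareThreads > 0);
    const int maxRanks = std::max(1, numHardwareThreads / numOmpThreadsPerRank);
    return std::min(numGpus, maxRanks);
}
```

For example, with 4 GPUs, 4 OpenMP threads per rank and 8 hardware threads, this sketch starts 2 ranks instead of 4.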

Fix multiple tMPI ranks per OpenCL device

The OpenCL context and program objects were stored in the gpu_info
struct which was assumed to be a constant per compute host and therefore
shared across the tMPI ranks. Hence, gpu_info was initialized once
and a single pointer to that data was used by all ranks.
This led to the OpenCL context and program objects of different ranks
sharing a single device getting overwritten/corrupted by one another.

Notes:
- MPI still segfaults in clCreateContext() with multiple ranks per node
  both with and without GPU sharing, so no changes on that front.
- The AMD OpenCL runtime overhead with all hw threads used is quite
  significant; as a short-term solution we should consider avoiding
  using HT by launching fewer threads (and/or warning the user).

Refs #1804

Change-Id: I7c6c53a3e6a049ce727ae65ddf0978f436c04579

LJ combination rule kernels for OpenCL

The current implementation enables combination rules for both AMD and
NVIDIA OpenCL (also ports the changes to the "nowarp" test/CPU kernel).

Like in the CUDA implementation, all kernels support it, but only for
plain cut-off are combination rules used.

Notes:
- On AMD tested on Hawaii, Fiji, Spectre and Oland devices;
  combination rules in all cases improve performance, although combined
  with the i-prefetching, the improvement is typically only ~10%.
- On NVIDIA tested on Kepler and Maxwell; in most cases the combination
  rule kernels are fastest.
  However, with certain inputs these kernels are 25% slower on Maxwell
  (e.g. pure water box, cut-off LJ, pot shift), but not on Kepler.
  This is likely a compiler mis-optimization, so we'll just leave the
  defaults the same as AMD.

Change-Id: I05396e000cdf93c1d872729e6b477192af152495

Re-enable i-atom type local mem prefetch in OpenCL

For reasons unknown this has been disabled in the original OpenCL
implementation. However, it turns out that prefetching does have
substantial performance benefits, especially on AMD (>10%) and in some
cases on NVIDIA too (although not on Maxwell).

This change re-enables prefetching code-path and turns it on
for AMD devices. For NVIDIA the decision will be revisited later.

The GMX_OCL_ENABLE_I_PREFETCH/GMX_OCL_DISABLE_I_PREFETCH environment
variables allow testing prefetching with future architectures/compilers.

Change-Id: I8324d62d3d78e0a1577dd3125edf059d3b311c2f

Correct mdrun tMPI (non-)parallel error message

Fixes #1931

Change-Id: Ifad46c7f62099a2cd80d70ccbe46bf3f2b5751e0

Make includesorter.py work with Doxygen 1.8.10+

Newer Doxygen versions write relative paths to the XML output, so make
the filtering that limits what to parse work with relative paths, using
XML entries that should give a relative path with all Doxygen versions.

Change-Id: I6172c625cd7e5293764b9a4dfeacf1c7de10ce78

Describe AMD APPSDK 3.0 issues in the guide

No better solution has been found to the issue, so we can only warn
users and provide advice for a workaround.

Fixes #1921

Change-Id: I4f6c8efa72e02d39ccb59977a2093802c0590563

Fix error reporting for documentation build

Make explicit file handling in the documentation build work, now that
the current working directory is managed separately. In some cases,
this could cause missing some warnings (probably mostly from
check-source) and marking a build incorrectly as successful. However,
the main effect was that reporting of the unsuccessful reason back to
Gerrit was less than ideal.

Fix issues that had been introduced while the checking was not working.

This can possibly be improved further by adding methods in releng for
wrapping all this file access, but that can wait for later, once there
is a bit of time to design a good, stable API for that.

Change-Id: I2f1dc927a2134365dba15a6b7dcec1a754b58433

Clean up vsite code

No functionality changes. Added some const correctness. Reduced the
scope of a lot of variables.

Change-Id: Ia6caba6686ecf9348223932f7cc812e601d0109c

More support for release workflow

- Add a matrix used for the deployment tests for tarballs.
- Make gromacs.py support builds with static linking, make it
  possible to run the tests (including regression tests) with 'make
  check', and do an install with a normal + mdrun-only build.
- Adapt the website build script to releng changes, and do not run all
  the Doxygen targets twice for the website build.

Some TODOs in gromacs.py identify what would still be necessary to get
to the level the Test_Tarballs_* jobs currently do, but that can also be
improved separately.  Now this, together with the releng changes,
produces something useful reasonably robustly.

Change-Id: Idca8974c5c9e2ee20e04f0a309e0f306a9c615d6

Ignore empty GMXLIB

If the GMXLIB environment variable was set but empty, it triggered an
assertion failure (at least) in pdb2gmx. Now this case is silently
handled as if the variable were not set at all.
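
The behaviour described above can be sketched in C++ as follows. The
helper names are hypothetical and this is not the actual GROMACS code;
it only shows the idea of treating an empty value like an unset one.

```cpp
#include <cstdlib>
#include <string>

// Returns the fallback when the raw value is missing (nullptr) or empty,
// so that both cases are handled identically.
std::string valueOrFallback(const char* raw, const std::string& fallback)
{
    if (raw == nullptr || raw[0] == '\0')
    {
        return fallback;
    }
    return std::string(raw);
}

// Hypothetical wrapper that looks up GMXLIB; the fallback path is an
// illustration-only parameter supplied by the caller.
std::string libraryDirectory(const std::string& fallback)
{
    return valueOrFallback(std::getenv("GMXLIB"), fallback);
}
```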

Part of #1928.

Change-Id: I51646ec659d119b20b7cb429f9292bffa5775aaa

Add real/SIMD template for LJ-14 pairs

Added a real/SIMD templated code-path for plain LJ 1-4 pair
interactions without free-energy, virial and energy.
We should add virial and energy, but the virial requires a different
treatment of the shift forces, that we should introduce at the same
time for angles and dihedrals.
This gives a speed improvement of a few factors. The main improvement
comes from simplified analytical LJ instead of tables; SIMD helps a bit.
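
To make the "analytical LJ, templated on the arithmetic type" idea
concrete, here is a minimal C++ sketch. It is not the kernel from this
change: the function name is hypothetical, and T is assumed to be either
a scalar type or a SIMD type providing the same arithmetic operators.
It evaluates the standard Lennard-Jones potential V = c12/r^12 - c6/r^6
and returns the scalar force divided by r, which multiplies the distance
vector to give the force vector.

```cpp
// Force over distance for plain LJ, computed analytically without tables.
// rSquared is the squared pair distance; c6 and c12 are the pair parameters.
template<typename T>
T ljForceOverR(T rSquared, T c6, T c12)
{
    const T rinv2 = T(1) / rSquared;        // 1/r^2
    const T rinv6 = rinv2 * rinv2 * rinv2;  // 1/r^6
    // F/r = (12*c12/r^12 - 6*c6/r^6) / r^2
    return (T(12) * c12 * rinv6 * rinv6 - T(6) * c6 * rinv6) * rinv2;
}
```

Neither energy nor virial is computed here, matching the restriction
mentioned in the message.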

Change-Id: Ifdefd875e32f6c5b7d609c3bd66a864fe77fc13e

Removed srenew that was making array smaller incorrectly.

An array y was first allocated to 6 entries, then reduced
to 3 entries using srenew, but after that the indices
3, 4 and 5 were used anyway. Removing the srenew fixes the problem.
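
The failure mode can be reproduced in a small C++ sketch. Here
std::vector stands in for the snew/srenew allocation macros, which are
not shown in this log, and the function name is hypothetical. With a raw
reallocated buffer the out-of-range writes would be undefined behaviour;
the bounds-checked at() turns them into a detectable exception.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Writes entries 0..5 of an array that was allocated with 6 entries.
// Returns false if any write falls outside the array.
bool fillSixEntries(bool shrinkToThree)
{
    std::vector<double> y(6, 0.0);
    if (shrinkToThree)
    {
        y.resize(3); // stands in for the srenew call that was removed
    }
    try
    {
        for (std::size_t i = 0; i < 6; ++i)
        {
            y.at(i) = static_cast<double>(i); // throws for i >= y.size()
        }
    }
    catch (const std::out_of_range&)
    {
        return false;
    }
    return true;
}
```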

Fixes #1920

Change-Id: I3f35f0af3f56d33435bf54a9c7cda6273fb3f05a

Avoid assertion failure in pme_loadbal_do

When ir->init_step % ir->nstlist != 0 an assertion would fail in
pme_loadbal_do. If it does not fail every time, the assertion can safely
be replaced by a conditional return, which is now done in this case.
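
The pattern of replacing an assertion with a conditional return can be
sketched in C++ as below. The function name and signature are
hypothetical, and the exact condition checked in pme_loadbal_do is not
shown in this message; the sketch only assumes that work should happen
on steps that are multiples of nstlist.

```cpp
// Returns true when load-balancing work should proceed at this step.
// Steps that are not a multiple of nstlist now cause an early "no"
// instead of tripping an assertion.
bool shouldBalanceAtStep(long long step, int nstlist)
{
    if (nstlist <= 0 || step % nstlist != 0)
    {
        return false; // conditional return where an assertion used to fail
    }
    return true;
}
```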

Fixes #1870.

Change-Id: Ie586022c87d3993f2de5a6fdf9210ae56aa7855b

Make gmx vanhove work without PBC

Change-Id: I37d59fc7edf6d6da36778ac3f342c71a876e2872

Clean-up code for swapcoords.

Fixes one of the TODOs left from change gerrit.gromacs.org/#/c/5660/

Change-Id: If9f443447f2cfd7fb757a386447a51a9f958dd41

LJ combination rule kernels for CUDA

Implemented LJ combination rule parameter lookup in the CUDA kernels
and enabled it for plain LJ cut-off, as was already present in the
SIMD kernels.
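
For readers unfamiliar with combination rules, the following host-side
C++ sketch shows the two standard ones: geometric and Lorentz-Berthelot.
It is not the CUDA kernel code from this change; the struct and function
names are hypothetical. The point is that pair parameters are derived
from per-atom values instead of being read from a pair-type table.

```cpp
#include <cmath>

// Pair parameters in the C6/C12 form of the LJ potential.
struct LJPair
{
    double c6;
    double c12;
};

// Geometric rule: each pair parameter is the geometric mean of the
// corresponding per-atom parameters.
LJPair combineGeometric(double c6_i, double c12_i, double c6_j, double c12_j)
{
    return { std::sqrt(c6_i * c6_j), std::sqrt(c12_i * c12_j) };
}

// Lorentz-Berthelot rule: arithmetic mean of sigma, geometric mean of
// epsilon, then conversion to C6 = 4*eps*sigma^6 and C12 = 4*eps*sigma^12.
LJPair combineLorentzBerthelot(double sigma_i, double eps_i, double sigma_j, double eps_j)
{
    const double sigma  = 0.5 * (sigma_i + sigma_j);
    const double eps    = std::sqrt(eps_i * eps_j);
    const double sigma6 = std::pow(sigma, 6);
    return { 4.0 * eps * sigma6, 4.0 * eps * sigma6 * sigma6 };
}
```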

Change-Id: Ic1222e9433eb21e8bd1fb81658ebbbfe42c1d2c2

Moved pull initialization back to mdrunner

Commit 29943fe5 moved the init_pull and init_rot calls from
mdrunner() to do_md(), but both are needed in energy minimization
as well and init_pull should be called before init_constraints().
Added an assertion to init_constraints() for pull initialization.

Fixes #1924.

Change-Id: I9420940f772afc2be93416619f464a6ef7472372

Add more const correctness

This helps reduce the size of a future change to how we handle reading
and writing conformation files.

In particular, the way we write symmetric low-level I/O routines that
do either reading or writing according to the value of a parameter
requires that parameters for values to be written are passed as
non-const at some level of the call stack. In several cases, this
change moves that point lower down, so that routines whose job is
writing complex data to files take const parameters. Improving this
helps understand what is going on when e.g. an analysis tool reads in
coordinates, perhaps modifies them, and then writes them. The down
side is that we need to use some const_cast as we get close to the
low-level routines.
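
The trade-off described above can be illustrated with a small C++
sketch. All names are hypothetical and a std::iostream stands in for the
real file handles; it shows a symmetric low-level routine that needs a
non-const pointer, and a write-only wrapper that offers a const
interface by confining the const_cast to one place.

```cpp
#include <iostream>
#include <sstream>

// Symmetric low-level routine: reads into *value or writes *value,
// depending on isRead. The pointer must be non-const for the read case.
bool doInt(std::iostream& stream, int* value, bool isRead)
{
    if (isRead)
    {
        stream >> *value;
    }
    else
    {
        stream << *value << ' ';
    }
    return !stream.fail();
}

// Write-only routine with a const parameter. The cast is safe because
// the write branch of doInt never modifies *value.
bool writeInt(std::iostream& stream, const int& value)
{
    return doInt(stream, const_cast<int*>(&value), false);
}
```

Callers that only write can then hold their data as const, and the
single const_cast documents exactly where constness is dropped.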

Change-Id: I5c22d8681fa5f247f302409bf67b062ebc9fe766

Fix warning from GCC 6

These omega variables could not actually be used when uninitialized,
but this change improves the code anyway.

Change-Id: Ic038ecf1f5ae1ff1f43bbc77b6de023d94b020b2

Improved some mdrun integration tests

Improved some naming.

Avoided using the whole of OPLS/AA just to get water parameters, which
may have been unstable when calling grompp many times in a binary, for
reasons unknown.

Removed warning suppressions for unsupported icc compiler versions.

Change-Id: I5107fca64f6a73e3c0b85146f28347453ded705a