Mark Abraham [Thu, 21 Dec 2017 10:56:59 +0000 (21:56 +1100)]
Separate canDetectGpus and findGpus further, and fix tests
Renamed detect_gpus to findGpus so that no code can silently call
detect_gpus while forgetting to call the required canDetectGpus first.
Some test code is updated accordingly, which should have happened
earlier. The function with the new name now needs no return value, so
the formerly confusing return value of zero for success is no longer
present.
Shifted some more responsibilities from findGpus to canDetectGpus, so
that the latter now has responsibility for ensuring that when it
returns true, the former will always succeed.
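The contract between the two functions can be sketched as follows. This is a minimal Python illustration, not the GROMACS C++ code: can_detect_gpus, find_gpus, detect, and the driver_present and devices arguments are hypothetical stand-ins, and returning a list from find_gpus is a choice made only for this sketch.

```python
from typing import List, Tuple


def can_detect_gpus(driver_present: bool) -> Tuple[bool, str]:
    """Hypothetical precondition check: returns (ok, message).

    When ok is True, a following find_gpus call is expected to succeed.
    """
    if not driver_present:
        return False, "no usable GPU driver or platform was found"
    return True, ""


def find_gpus(devices: List[str]) -> List[str]:
    """Hypothetical enumeration step; it reports devices, not a status code."""
    return list(devices)


def detect(driver_present: bool, devices: List[str]) -> List[str]:
    ok, _message = can_detect_gpus(driver_present)
    # Enumerate only when the precondition check passed
    return find_gpus(devices) if ok else []
```

Keeping the failure cases in the first function means the caller never has to interpret a status code from the second one.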
Fixed tests that compile with CUDA, but cannot run unless there
are visible compatible devices and a valid context.
Refs #2347, #2322, #2321
Change-Id: I34acf8be4c0f0dcc29e931d83c970ba945865ca7
Roland Schulz [Wed, 27 Dec 2017 18:59:04 +0000 (10:59 -0800)]
Replace GMX_ALIGNED with alignas and SIMD alignment define
This allows each platform to define its required or
preferred SIMD alignment, and we avoid using alignments
larger than what might be supported on each architecture.
Also fix overalignment in pairs.
Fixes #2365
Change-Id: I4793adf31d186eade8a1fd8c920ab75c685ad53f
Mark Abraham [Tue, 26 Dec 2017 13:01:43 +0000 (14:01 +0100)]
Silence POWER8 compiler warnings
Change-Id: Id116b048d775d36ce8ac942380cfd64deed3bb7e
Roland Schulz [Tue, 26 Dec 2017 18:51:23 +0000 (10:51 -0800)]
Fix quote
Change-Id: Ic763145c231951cd8e61cf36f3486de7a3d30e30
Erik Lindahl [Tue, 12 Dec 2017 19:29:41 +0000 (20:29 +0100)]
Fix builds on ARM & clarify (ARM) GPU support
Fixed a typo in architecture.h that prevented
the Neon Asimd instructions from being selected,
and updated the CPU brand detection to also look
for a new label with Tegra X1 on Ubuntu 16.04
Clarified in error messages and documentation that
Gromacs in fact does not build all supported GPU
architectures by default, explain the common cases
when things might fail, exactly what the user
should do to enable the support, and how the
support strings should be formatted.
Fixes #2287.
Change-Id: I87a2eb81ee11b78f072e3ef359a00c75eb7ec24b
Mark Abraham [Fri, 22 Dec 2017 14:11:52 +0000 (01:11 +1100)]
Merge branch release-2016 into release-2018
Change-Id: I909443584916c574f94b38d0aab2701163faecad
Mark Abraham [Thu, 21 Dec 2017 14:00:56 +0000 (01:00 +1100)]
Merge branch release-5-1 into release-2016
Change-Id: Ic4aa87c01dfe281c9c8aa31fbc0f7f9fcbe9752d
Mark Abraham [Thu, 21 Dec 2017 13:56:25 +0000 (00:56 +1100)]
Bump minor version for theoretical new release
Change-Id: I54152717d87ad84e7fa5549887b587aa10fe43e4
Pascal Merz [Thu, 14 Dec 2017 22:55:34 +0000 (15:55 -0700)]
Documentation and fixes for physical validation
Addresses #2349
Adds documentation for the physical validation suite in
docs/dev-manual/physical_validation.rst
As the previous behavior was easy to misunderstand, changed the default behavior of
`make check-phys` and `make check-all` to actually run the simulations.
This might take very long, but since the physical validation tests need to
be turned on explicitly via cmake option, the chances of somebody using the
tests by mistake are low. The `check` targets are:
* `make check`: Run unit and regression tests (unchanged)
* `make check-phys`: Run simulations needed for physical validation, then
run physical validation tests
* `make check-phys-analyze`: Only run physical validation tests, assuming
that simulations were run previously and are available.
* `make check-all`: Combination of `make check` and `make check-phys`
Additionally, `make check-phys-prepare` can be used to prepare GROMACS
input files and a script to run the simulations needed for the physical
validation tests.
Fixed a dependency issue that attempted to run the validation tests before
the `gmx` target was built.
Fix in physical_validation package to avoid including matplotlib by default
(and hence requiring X and Gtk).
Improved output of physical validation script (update while running
simulation, info about script after preparing input files, etc).
Additional minor changes not mentioned in #2349:
* Allow the name of a test to differ from the directory of its input files, which
allows several tests to be defined on the same input files.
* Adapted some tests - more thorough checking of ensembles (check
temperature- and pressure-dependence separately *and* collectively), and
updated tolerance levels.
* Bumped physical_validation package to newest version to include bugfixes
and stability improvements.
Change-Id: I7219f5392ed068ba1b2e13b30efe5aa4e36ff586
Mark Abraham [Thu, 7 Dec 2017 04:36:49 +0000 (15:36 +1100)]
Update double-precision test configurations
These changes improve coverage of double precision, using more release
mode, particularly with latest gcc and icc, and using 128-bit SIMD,
which have been cases that were buggy recently. The other aspects of
the configurations that have been modified have been
non-critical. Where appropriate, brief rationales are recorded. This
resolves an old TODO item in the post-submit matrix.
Fixed a sign mismatch in initializing an OpenCL variable that didn't
need to be initialized.
Noted relevant new TODOs.
Refs #2300, #2325, #2326, #2334, #2335, #2336, #2337, #2338
Change-Id: I131fa1a6776d1e7809799c3f931a1fc8100fcdc9
Mark Abraham [Thu, 21 Dec 2017 08:47:01 +0000 (19:47 +1100)]
Version 5.1.5
Bumped SOVERSION_MINOR and REGRESSIONTEST_HASH
Change-Id: Ie62ecacbe53f4f39e4592a7593ed53ffe47fd642
Roland Schulz [Wed, 20 Dec 2017 19:42:17 +0000 (11:42 -0800)]
Update checks for BuildOwnFFT
- Allow on Windows (e.g. WSL, Mingw)
- Disallow with Ninja (broken)
Fixes #2356
Change-Id: I8ac5dd520f92b882dcaeb009792fae2d6e9f0062
Aleksei Iupinov [Wed, 20 Dec 2017 13:16:38 +0000 (14:16 +0100)]
Set GMX_GPU_AUTO to FALSE with GMX_GPU defined
Refs #1985, #2357
Change-Id: I5cada97015ee94717ea6eb988b3a84a351f11293
Mark Abraham [Wed, 20 Dec 2017 11:13:00 +0000 (22:13 +1100)]
Use new defaultRealTolerance() to stabilise some tests
The single-precision values in these tests sometimes failed to compare
equal with string-ified versions when the test was run from a
double-precision build on Windows. That difference may have originated
in how a rounding mode was implemented in different libraries, but
ultimately doesn't matter because the value should not be compared as
if it was computed in double precision.
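The effect described above can be reproduced in a few lines. The sketch below is Python, not the GROMACS test code: to_single, equal_within_single_tolerance, and the 1e-6 relative tolerance are assumptions for illustration and do not reflect what defaultRealTolerance() actually uses.

```python
import struct


def to_single(value: float) -> float:
    """Round a Python float (a double) to the nearest IEEE-754 single."""
    return struct.unpack("f", struct.pack("f", value))[0]


def equal_within_single_tolerance(a: float, b: float,
                                  rel_tol: float = 1e-6) -> bool:
    """Compare with a tolerance sized for single precision (assumed value)."""
    return abs(a - b) <= rel_tol * max(abs(a), abs(b))


stored = to_single(0.1)   # value as computed in single precision
parsed = float("0.1")     # same value read back from text as a double
```

Here stored and parsed differ when compared exactly as doubles, yet they agree within the single-precision tolerance, which is the comparison that matters for a value that only ever had single-precision accuracy.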
Change-Id: I9b7c2b9409145cc579229f0866935754e3a9dcac
Mark Abraham [Wed, 20 Dec 2017 12:33:56 +0000 (13:33 +0100)]
Merge "Merge branch release-2016 into release-2018" into release-2018
Mark Abraham [Tue, 19 Dec 2017 13:23:59 +0000 (00:23 +1100)]
Merge branch release-2016 into release-2018
Change-Id: I2c3aa754de1b8ff971854740da9815fff8a41f0d
Aleksei Iupinov [Tue, 19 Dec 2017 15:40:23 +0000 (16:40 +0100)]
Remove potentially wrong "per user request" note from npme reporting
Refs #2204
Change-Id: Idd2127737b7b1af9cec9f7b547478b8b82cdb59a
Berk Hess [Tue, 19 Dec 2017 10:55:25 +0000 (11:55 +0100)]
Make acceleration correction VCM mode work
The new acceleration correction VCM mode did not actually correct
the coordinate for the acceleration, since a null pointer was passed.
Introduced an extra CGLO flag to allow for correction of the
coordinates, but leave the initial coordinates unaffected.
Change-Id: I673793902df7241085fff20c63cf3ce88ef60313
Mark Abraham [Tue, 19 Dec 2017 01:03:45 +0000 (02:03 +0100)]
Third beta release of 2018
Change-Id: I95296a74713ec41a18dd5c866e36b6c0ab66046b
Mark Abraham [Mon, 11 Dec 2017 05:36:43 +0000 (16:36 +1100)]
Fix table tests and improve table construction
Since compilers are allowed to use different FMA constructs, we
now allow the consistency check to deviate a few ulps.
For sinc and other extreme functions that oscillate, the
scan over the definition range to locate the minimum quotient
between the 1st and 4th derivative to set the table spacing
exposes some delicate errors. Basically, it is not possible
to have arbitrarily low relative errors for the derivative
for a function that has large magnitude in the same place.
For now we reduce the test interval for sinc(); this should
anyway not be relevant for normal well-behaved MD functional
forms.
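A comparison that tolerates "a few ulps" can be sketched as below. This is an illustrative Python version for finite doubles, not the GROMACS test utility; ulp_distance, within_ulps, and the default of 4 ulps are hypothetical.

```python
import struct


def ulp_distance(a: float, b: float) -> int:
    """Number of representable doubles between a and b (finite inputs only)."""
    def ordered(x: float) -> int:
        bits = struct.unpack("<q", struct.pack("<d", x))[0]
        # Map the sign-magnitude bit pattern onto a monotonic integer line,
        # so that -0.0 and +0.0 both land on 0
        return bits if bits >= 0 else -(bits & 0x7FFFFFFFFFFFFFFF)
    return abs(ordered(a) - ordered(b))


def within_ulps(a: float, b: float, max_ulps: int = 4) -> bool:
    """True if a and b are at most max_ulps representable values apart."""
    return ulp_distance(a, b) <= max_ulps
```

For example, within_ulps(0.1 + 0.2, 0.3) holds even though the two values are not bitwise equal, which is the kind of deviation that different FMA contractions can introduce.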
Fixes #2336.
Change-Id: I5f999ae871ae21ddc5b59cf78ad8bd27fe2df622
Berk Hess [Fri, 15 Dec 2017 15:12:15 +0000 (16:12 +0100)]
Make oversubscription warning consistent
The hardware thread oversubscription warning was only issued
with OpenMP and without separate PME ranks. Now it actually reduces
the thread count over the physical node.
Also moved the thread affinity up to the earliest possible point.
Refs #2345
Change-Id: Ifdf62c723fd87b0ddaab0df1e2f1bf36b461ea33
Carsten Kutzner [Mon, 18 Dec 2017 16:13:11 +0000 (17:13 +0100)]
Fix electric field .mdp documentation
Typo accidentally introduced in
ca9e2450877d00e49
Change-Id: Idc1bf111b65c4e365aad61b2fb83a447d7c58bfd
Erik Lindahl [Sun, 17 Dec 2017 10:03:17 +0000 (11:03 +0100)]
Use reduced default tolerances for tpx comparison
The tolerances for gmx check are mainly intended
for handling slight statistical deviations, but they
can hide differences between tpr files, when the
user likely wants exact checks. This change
sets the default relative tolerance to 0.000001
and the absolute tolerance to zero, so that we only
allow for minor differences due to compiler optimization.
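The effect of the new defaults can be sketched as follows. This is a Python illustration only: values_match is a hypothetical helper, and the way gmx check actually combines the relative and absolute tolerances is an assumption here.

```python
def values_match(a: float, b: float,
                 rel_tol: float = 1e-6, abs_tol: float = 0.0) -> bool:
    """True if a and b agree within the absolute or the relative tolerance.

    Defaults mirror the values named in the commit message; the combination
    rule (absolute OR relative) is an assumption made for this sketch.
    """
    diff = abs(a - b)
    if diff <= abs_tol:
        return True
    return diff <= rel_tol * max(abs(a), abs(b))
```

With an absolute tolerance of zero, two values near zero only match if they agree to the relative tolerance, so small but real parameter differences are no longer hidden.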
Fixes #2024.
Change-Id: I55b882a194d931bf5c36541e25339b6e1eb0a1e4
Roland Schulz [Fri, 15 Dec 2017 02:19:11 +0000 (18:19 -0800)]
Fix cmake error with emacs lock files in place
Change-Id: If95d6a77303373c7a0bac5299e6f9dacd1fac81c
Roland Schulz [Sun, 17 Dec 2017 20:01:29 +0000 (12:01 -0800)]
Fix typo introduced in
b1b1163a245
Change-Id: I288caff40c6252fd059bc2601b4f5afd438aba08
Roland Schulz [Thu, 14 Dec 2017 17:51:48 +0000 (09:51 -0800)]
Change avx_128_fma build to double
Important for:
- Simd4N=Simd4
- sizeof(SimdInt32)!=4*GMX_SIMD_REAL_WIDTH
Change-Id: I79efca09547d84d05533c963d53ed6bf7daf3e7b
Roland Schulz [Mon, 11 Dec 2017 11:07:51 +0000 (03:07 -0800)]
Support Simd4N for SimdRealWidth<4
If the SIMD width is smaller than 4 but Simd4N is supported
then use Simd4 for Simd4N.
Also move Traits into internal namespace to signify that they
are not intended for usage outside of the simd module.
Fixes #2327
Change-Id: I3d49c57cebc5d565df442d01e322c89312771699
Roland Schulz [Fri, 15 Dec 2017 01:19:48 +0000 (17:19 -0800)]
Remove incorrect/misleading OpenMP message
Change-Id: I69e8b84a593aeb3dff868a169ac88f17c101ec59
Roland Schulz [Fri, 15 Dec 2017 00:49:56 +0000 (16:49 -0800)]
Revert "Use table Ewald for Skylake"
This reverts commit
80c3f0d8ec228c6266ab721cadcb3dca48aad1d1.
After the bugfix of table (12e9ea41a9cee) this isn't the right
choice anymore.
Change-Id: I4c916c1d038c0e7501b1bd9f91ebc89f5afbd2cd
Mark Abraham [Thu, 14 Dec 2017 11:12:34 +0000 (22:12 +1100)]
Make OpenCL implementation of gpu_utils tests conform
These fell out of sync with the CUDA implementation, and can only be
noticed if you run tests from an OpenCL build with no devices
available.
Change-Id: Ib196cf498c2537f814c9153b0279af8fdf01234d
Mark Abraham [Tue, 12 Dec 2017 09:35:38 +0000 (20:35 +1100)]
Avoid confusing message at end of non-dynamical runs
EM, TPI, NM, etc. are not targets for performance optimization
so we will not write performance reports. This commit fixes
an oversight whereby we would warn a user when the lack of
performance report is normal and expected.
Fixes #2172
Change-Id: I1097304d79701be748612510572382729f7f26be
Mark Abraham [Thu, 14 Dec 2017 03:04:13 +0000 (14:04 +1100)]
Fix return values of frame-reading functions
This function was based on read_first_x that returned the number of
atoms, and was documented to do the same, but has always returned a
logical boolean about whether a frame has been read. This led to
aspects of gmx spatial and gmx trjcat -demux being broken.
Fixed by returning a proper bool, and fixing the remaining logic that
used the return value in a non-boolean sense.
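The bug pattern can be illustrated with a small Python sketch. The names read_frame, count_atoms_buggy, and count_atoms_fixed are hypothetical and do not correspond to GROMACS functions; frames are modelled as plain lists of coordinates.

```python
def read_frame(frames, index):
    """Return True if a frame exists at index, False at end of input."""
    return index < len(frames)


def count_atoms_buggy(frames):
    # Bug pattern: treats the boolean "frame was read" result as an atom count
    return int(read_frame(frames, 0))


def count_atoms_fixed(frames):
    # Use the boolean only as a success flag; take the count from the frame
    if not read_frame(frames, 0):
        return 0
    return len(frames[0])
```

With a five-atom frame, the buggy variant reports one atom because True converts to 1, while the fixed variant reports five.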
Refs #2157
Change-Id: Ic871b56f68c7dbc654ab11b34ff82932353e6ceb
Berk Hess [Tue, 12 Dec 2017 15:10:08 +0000 (16:10 +0100)]
Require -ntmpi with setting -ntomp with GPUs
With GPUs and thread-MPI, setting only -ntomp could lead to
oversubscription of the hardware threads.
Now with GPUs and thread-MPI the user is required to set -ntmpi when
using -ntomp. We chose to also require -ntmpi when the user
specified both -nt and -ntomp; here we could infer the number of
ranks, but it's safer to ask the user to explicitly set -ntmpi.
Note that specifying both -ntmpi and -nt has always worked correctly.
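The rule can be written down as a small validation routine. This is a Python sketch under stated assumptions, not mdrun's option handling: check_thread_options is hypothetical, and a value of 0 stands for "not set by the user".

```python
def check_thread_options(use_gpu, thread_mpi, nt=0, ntmpi=0, ntomp=0):
    """Raise ValueError for the combination the commit message rules out.

    A value of 0 means the option was not given. Setting nt does not lift
    the requirement, matching the choice described in the commit message.
    """
    if use_gpu and thread_mpi and ntomp > 0 and ntmpi == 0:
        raise ValueError(
            "with GPUs and thread-MPI, -ntomp requires -ntmpi to be set")
```

Calling check_thread_options(True, True, nt=8, ntomp=4) raises, whereas adding ntmpi=2 or running without GPUs passes.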
Fixes #2348
Change-Id: Iad380721807f5c53b8c70808cea75c5f29341a8f
Mark Abraham [Tue, 12 Dec 2017 19:41:16 +0000 (20:41 +0100)]
Make AVX-512 CMake detection work
Both inline assembly and the support flag have to be set for the
timing code to be compiled.
Also fixed tabs and made the warning about the number of FMA units
only a status message - it's not any more important than anything else
we make assumptions about.
Change-Id: I1a430d12bc52cbea2495d2d3837c851761f552d8
Paul Bauer [Mon, 11 Dec 2017 11:53:12 +0000 (12:53 +0100)]
Remove PBC before generating TPR with group scheme
Ensure that all molecules have been made whole before generating the
run input file when using the group scheme, to avoid error
messages for large charge groups when molecules are broken
over PBC boundaries.
Fixes #2339
Change-Id: Iecba013826cbe46e7f70bd674935f9946806ee2e
Roland Schulz [Wed, 13 Dec 2017 21:00:12 +0000 (13:00 -0800)]
Simplify CMAP loop
Fixes #2350
Change-Id: Id87a2105012d541f77d1c278029fe36b874328a9
Erik Lindahl [Tue, 12 Dec 2017 14:54:56 +0000 (15:54 +0100)]
Work around AVX-512 issues in gcc-5.4 and 7.1
Fixes compilation issues with mixed and double precision builds using
AVX-512 SIMD with gcc-5.4 or gcc-7.1. Also tested with gcc-6.3, and
Debug as well as Release builds for all three versions, all of which
now pass the simd unit tests.
Fixes #2325.
Change-Id: I59c3ae0467b51412d1ebbb5b57a248534288a5db
Erik Lindahl [Tue, 12 Dec 2017 21:51:18 +0000 (22:51 +0100)]
Fix PBC error in gmx_spatial
Fix provided by Alexey Anikeenko.
Fixes #2157.
Change-Id: I2ac8a4ffac5acb0f3e432036ded3b380d720a719
Mark Abraham [Thu, 14 Dec 2017 08:26:49 +0000 (09:26 +0100)]
Merge "Merge branch release-2016 into release-2018" into release-2018
Erik Lindahl [Tue, 12 Dec 2017 22:00:49 +0000 (23:00 +0100)]
Documented power spectrum to gmx velacc
This option was saved by the bell and my nostalgia for
power spectra. The usual recourse is that we get rid of
functionality that nobody takes the time to document.
Fixes #2019.
Change-Id: I09fc5367a59300e2fca780125893fad5c1d72b81
Roland Schulz [Tue, 12 Dec 2017 19:17:16 +0000 (11:17 -0800)]
Add work-around for ICC 18.0/.1
Also remove old work-around for unsupported version.
Change-Id: I927749b55f76b30b7d9140f84fa6216c725b5852
Erik Lindahl [Tue, 12 Dec 2017 14:31:07 +0000 (15:31 +0100)]
Require TPR file for gmx cluster
The program crashes without it, so it wasn't
optional.
Fixes #2170.
Change-Id: I63af728eea047dd4c3b11d890d507473109a7279
Erik Lindahl [Tue, 12 Dec 2017 00:15:00 +0000 (01:15 +0100)]
Disallow ascii formats for gmx trjcat
Since trjcat (deliberately) does not use any TPR file,
the tool can't handle trajectory formats such as GRO
or PDB where atom/residue names are needed.
Fixes #2225.
Change-Id: I55cbfd5f8a3909c1f76e63fa402f0c3243a6f7c7
Mark Abraham [Tue, 12 Dec 2017 11:24:15 +0000 (22:24 +1100)]
Improve grompp missing-parameters error message
If an interaction entry had parameters but not the function type, then
the error message has been confusing. Note that even when only one
function type is implemented, the field is still required, which makes
for ready extensibility.
Refs #2144
Change-Id: I356e14541d4aaffad054d5ecfb8a9e3cb04cd25f
Berk Hess [Mon, 11 Dec 2017 13:17:59 +0000 (14:17 +0100)]
Check for large energy at first step
Also added step number to fatal error message.
Fixes #2333
Change-Id: I6e8aa1fac3a3c9a358b4046de5c8a3547ae14b15
Mark Abraham [Tue, 12 Dec 2017 08:56:40 +0000 (19:56 +1100)]
Merge branch release-2016 into release-2018
Change-Id: Icd7a26c399258adccf93b6d49b43567810072802
Berk Hess [Mon, 11 Dec 2017 21:03:53 +0000 (22:03 +0100)]
Avoid yelling about thread pinning twice
Do not print the second, generic note about not pinning threads
when we printed a first, more detailed warning.
Updated the threadaffinity tests for this.
Fixes #2342
Change-Id: I1fa8374d70891c3bc1caf2827210875e1c1a020f
Roland Schulz [Tue, 12 Dec 2017 05:46:37 +0000 (21:46 -0800)]
Fix shift usage for KNC
9437181eacb removed the shift operator without
replacing the usage for KNC.
Change-Id: Ia5600c02d423cbb6cbaf730f7531f13bfe171132
Mark Abraham [Tue, 5 Dec 2017 11:52:10 +0000 (22:52 +1100)]
Second beta release of 2018
Updated release matrix because bs_nix-amd is unreliable
Change-Id: I34262535e7a9cffc21e83fff60570605fbe7955f
Erik Lindahl [Mon, 11 Dec 2017 16:47:22 +0000 (17:47 +0100)]
Remove duplicated lines from OPLS ffbonded.itp
Only identical lines have been removed, as identified
with sort ffbonded.itp | uniq -c | sort.
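The same check can be scripted. The Python sketch below keeps the first occurrence of every line and reports which lines appeared more than once; drop_identical_lines is a hypothetical helper, and a real cleanup of a force-field file would also need to leave blank lines, comments, and section headers alone, which this sketch does not attempt.

```python
from collections import Counter


def drop_identical_lines(lines):
    """Return (unique_lines, duplicated_lines) for a list of strings.

    unique_lines preserves the original order; duplicated_lines is sorted.
    """
    counts = Counter(lines)
    duplicated = sorted(line for line, n in counts.items() if n > 1)
    seen = set()
    unique = []
    for line in lines:
        if line not in seen:
            seen.add(line)
            unique.append(line)
    return unique, duplicated
```

Only byte-identical lines are treated as duplicates, which matches the conservative approach described in the commit message.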
Fixes #1678.
Change-Id: Id6f2b335f385ed31ddec1662d2f446b139067d1a
Paul Bauer [Mon, 11 Dec 2017 16:02:39 +0000 (17:02 +0100)]
Fewer messages when -cpi file is not present
Removed duplicated message when -cpi restart is not found.
Fixes #2173
Change-Id: I6543c077dfc9660177e61dd2ba0018bf6804ed58
Roland Schulz [Wed, 29 Nov 2017 02:04:31 +0000 (18:04 -0800)]
Replace intrinsic with inline asm for AVX512 unit test
Without high optimization, some compilers (icc) produce
assembly that leads to lots of store-to-load forwarding
during initialization, which screws up timing results.
The modified code uses inline asm without loading from
memory, which is fine since the inline (volatile) asm
will not be optimized. Tested to work and detect 2 FMA
units on Core i9-7920X and 1 FMA on Xeon Silver 4116,
with gcc-5.4, gcc-7.1, icc 2017 and clang-5 with
optimization levels from -O0 to -O3. We also avoid
warning if we override the architecture with the
AVX-512 flags for the source file containing the asm.
Fixes #2340.
Change-Id: I3aea95b162c55c7773182a69f639dff1a01d0603
Erik Lindahl [Sun, 10 Dec 2017 17:21:32 +0000 (18:21 +0100)]
Don't warn about NVML clocks that are at max
If the clocks are already maxed out there is no
point in echoing warnings about not being able
to set them.
Fixes #2313.
Change-Id: I77bc7111489166b580c2d7742a7729c003f25e9e
Mark Abraham [Mon, 11 Dec 2017 06:49:42 +0000 (17:49 +1100)]
Leave NVML use off by default
Even if NVML is found, leave the default off because the
linking is unreliable for reasons that are currently unclear,
and only in some cases is linking with NVML advantageous.
Fixes #2311
Change-Id: I03e833964995f88350bf6cb70c06f1e3f67bb865
Mark Abraham [Sun, 10 Dec 2017 07:35:52 +0000 (18:35 +1100)]
Check for GPU detection support before detecting
When a CUDA-enabled binary was run on a node with no CUDA driver
available, a note was issued that the version of the CUDA driver is
insufficient, which was wrong.
Fixed this by separating detection of a valid CUDA driver (or OpenCL
platform) from enumerating the compatible devices. This permits a
GPU-enabled build configuration to gracefully degrade to the same
behaviour as a CPU-only build configuration.
Also suppressed more warnings about use of OpenCL API elements that
have been deprecated but which we intend to continue to use
regardless.
Also fixed confusing name of rank_local, and replaced it with a
boolean that cleanly describes the required functionality.
Also fixed and simplified logic of printing the GPU report. The
implementation only prints details about the node of the master rank,
so there is no value in checking a variable that reflects the number
of GPUs detected across all nodes.
Fixes #2322
Change-Id: I831d3c0017dafc00f7bb82e3f71be5b122657d1e
Paul Bauer [Wed, 6 Dec 2017 13:02:08 +0000 (14:02 +0100)]
Disallow combination of PME-user and verlet cutoff
Fixes #2332
Change-Id: I127a5680a0a83b7e5f8163b99619a6cc3729a992
Mark Abraham [Sun, 10 Dec 2017 12:07:18 +0000 (23:07 +1100)]
Consume any error produced during GPU detection
Having reported it, we should clear the CUDA error status so that
future calls do not continue to return it.
Fixes #2321
Change-Id: Id5c6445074b6b835296fcb544b7fc94168edc974
Mark Abraham [Mon, 11 Dec 2017 00:15:22 +0000 (01:15 +0100)]
Avoid signed-overflow warning with gcc 7
The code assumes no overflow will occur. Both the old and new code are
vulnerable to over/underflow for different extreme values of *n, which
are both undefined behaviour (for signed integers). So it is not
clear to me why this formulation keeps the compiler happy.
Otherwise, we get "cc1plus: warning: assuming signed overflow does not
occur when assuming that (X - c) <= X is always true
[-Wstrict-overflow]"
Change-Id: I354c547f5ae03e2bcf4485874aac454a8397d7b0
Mark Abraham [Mon, 11 Dec 2017 02:23:57 +0000 (03:23 +0100)]
Remove SIMD shift operators
These were almost unused, and caused problems in debug or clang builds
when the intrinsics required immediate operands that were not always
understood by the compiler to be available, because the function
argument was a variable. This could be fixed (see #2323), but there is
almost no known use for this functionality. For AVX-512, the fastMultiply
function is now implemented with explicit intrinsics.
Fixes #2323
Change-Id: Ide45bde08deb425c18f35cd2adb263e566d643a1
Berk Hess [Sat, 9 Dec 2017 08:54:29 +0000 (09:54 +0100)]
Add grompp note for PR pcouple + position restraints
Refs #2330
Change-Id: I299e39247d69214d9b54c47874d522ac7b00a30e
Erik Lindahl [Sun, 10 Dec 2017 16:52:38 +0000 (17:52 +0100)]
Revert "Disable default-on NVML support in CMake"
This reverts commit
969fb1d08b7fd33c809ce967b9e3bc38d5ac83cd.
Sorry, my bad for not testing it before submitting. It won't work to
just define the option inside a conditional for whether the option is
set - then it will never be available in the CMake GUI.
Change-Id: Ia8310332932d6ce81c9249560744fdaa2648bee4
Berk Hess [Tue, 5 Dec 2017 10:58:03 +0000 (11:58 +0100)]
Remove SIMD warning for AMD Zen
After choosing nbnxn 2xNN kernels and changing to the tabulated Ewald
nonbonded kernels, AVX2_256 is only a few percent slower than AVX2_128
on AMD Zen and is faster with nonbondeds and PME on a GPU. So we
should not warn the user when AVX2_256 is used.
Refs #2328
Change-Id: I67b66b0025c7e3c31943f3f02b80e97fb9764066
Berk Hess [Tue, 5 Dec 2017 14:58:59 +0000 (15:58 +0100)]
Tone down note about not pinning threads
The note mdrun prints to stderr and log file when not pinning
threads is now a short note instead of the old failure message,
since not pinning is often a choice, not a failure. With more than
one thread, a failure or longer explanatory note will anyhow have
been issued before.
Also removed a warning about OpenMP set affinity when the total
number of threads was chosen by mdrun.
Updated the thread affinity tests for the new message. Added tests
that are now possible with the recently introduced auto thread count
variable.
Fixes #2088 and #2319
Change-Id: I1411a1ce6e222d22da8d70bf7bab2c9bb7564507
Szilárd Páll [Fri, 8 Dec 2017 19:04:06 +0000 (20:04 +0100)]
Disable default-on NVML support in CMake
Due to the problems related to NVML builds failing in link stage when
linking against stub libs, we disable NVML by default to protect users
from a hard-to-diagnose bug.
Refs #2311
Change-Id: Id083254bc4344fbb3a91e7dd645a5f814163d043
Mark Abraham [Fri, 8 Dec 2017 02:02:19 +0000 (13:02 +1100)]
Refine ulp tolerances for some more ewald tests
Fixes #2337
Change-Id: I86bb615fa147988c8e54f0bbd7e17c2f61b312d0
Berk Hess [Fri, 8 Dec 2017 14:33:06 +0000 (15:33 +0100)]
Correct Shake test tolerances
The shake test used a tolerance on the square of the distance instead
of the distance itself as the documentation says.
Added tolerance for rounding errors due to the absolute size of the
coordinate values involved.
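Why the choice between distance and squared distance matters can be seen from a first-order estimate: for a small deviation, the relative error in d^2 is about twice the relative error in d. The Python sketch below only illustrates that relationship; the helper names and the sample bond length are hypothetical and unrelated to the actual test code.

```python
def rel_error_distance(d: float, d0: float) -> float:
    """Relative deviation of the distance d from the reference d0."""
    return abs(d - d0) / d0


def rel_error_squared(d: float, d0: float) -> float:
    """Relative deviation of d squared from d0 squared."""
    return abs(d * d - d0 * d0) / (d0 * d0)


d0 = 0.1                # hypothetical constraint length
d = d0 * (1.0 + 1e-4)   # constraint satisfied to a relative 1e-4
ratio = rel_error_squared(d, d0) / rel_error_distance(d, d0)
```

Because the ratio is close to 2, the same numerical tolerance means different things depending on which quantity it is applied to, so the test and the documentation need to agree on one of them.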
Fixes #2338
Change-Id: I1a771c0682fd694b2672986b46c58d4888d5a4a2
Berk Hess [Fri, 8 Dec 2017 12:56:29 +0000 (13:56 +0100)]
Add Viveca Lindahl as contributor
Also fixed a typo in the name of a contributor in the manual.
Change-Id: I62761a3aeb8aabc115592f23f844d9ae72afb108
Paul Bauer [Fri, 8 Dec 2017 10:57:24 +0000 (11:57 +0100)]
Add information regarding xlc compiler
Added a note that the xlc compiler
is neither supported nor tested.
Refs #2102
Change-Id: I1963a2fdaa6e27f4d9521c28088fc1c1f7eabe97
Erik Lindahl [Thu, 7 Dec 2017 13:28:17 +0000 (14:28 +0100)]
Added cool quote from Viveca's defense
... from her discussions with Bert de Groot.
Change-Id: Ifaa50cfc236d568b2c659b6cffc94b40ad577762
Roland Schulz [Wed, 6 Dec 2017 21:53:07 +0000 (13:53 -0800)]
Fix AWH test accuracy
Fixes #2334
Change-Id: Ieda604a3dbd1c253302214559e169581dfc02fe1
Berk Hess [Mon, 4 Dec 2017 22:12:30 +0000 (23:12 +0100)]
Choose faster nbnxn SIMD kernels on AMD Zen
On AMD Zen tabulated Ewald kernels are always faster than analytical.
And with AVX2_256 2xNN kernels are faster than 4xN.
These faster choices are now made based on CpuInfo at run time.
Refs #2328
Change-Id: I146bc012910bc1f46ed14155651c3d2a7c1f91e5
Roland Schulz [Thu, 7 Dec 2017 01:20:58 +0000 (17:20 -0800)]
Fix exp test for ICC in double
Fixes #2335
Change-Id: I5f688c64d59e8d2a23239fc945756bcc5130d15b
Roland Schulz [Thu, 7 Dec 2017 10:28:58 +0000 (02:28 -0800)]
Fix RelWithAssert with ICC
The fp-model flags required for ICC to run with fp-assertions
were not correctly appended. The forwarding of build configuration
specific flags was not working. The RelWithAssert flags for ICC
were the only such flags.
Change-Id: I138fd82eb758157f47ff5186a040afb9e4a2d956
Roland Schulz [Tue, 5 Dec 2017 20:41:18 +0000 (12:41 -0800)]
Fix C++11 library check's thread dependency
The test depended on the thread library without explicitly linking it,
which some compilers require.
Fixes #2051
Change-Id: I9929d9373a4e8caa393f548d25a78bc03c016bf9
Aleksei Iupinov [Wed, 6 Dec 2017 09:58:43 +0000 (10:58 +0100)]
Relaxed precision in spline interpolation tests
The recent change I2babd8a7c5d4ab436e67a8b8d1ec0532a482ec94
removed the precision-dependent testing tolerances and
inadvertently tightened the double precision tolerance of
spline interpolation tests. This change relaxes the
spline values/derivatives tolerances for all precisions.
Change-Id: I4e97a86ba75f234cfa2ac2c4e644f52403578f1f
Roland Schulz [Wed, 6 Dec 2017 20:07:17 +0000 (12:07 -0800)]
Fix ICC warnings
Fixes #2300
Change-Id: Ia5ce081af805d905b7642d773fa51fb0b729a109
Aleksei Iupinov [Wed, 6 Dec 2017 11:42:45 +0000 (12:42 +0100)]
Provide dummy initialization to silence a warning
Either the real value is set in the loop below, or the program terminates.
Fixes #2329
Change-Id: Iefa89e1e1ad6d454befd461ea650e7f3bb79904c
Roland Schulz [Tue, 5 Dec 2017 23:29:42 +0000 (15:29 -0800)]
Added doxygen note to clarify rounding mode of SIMD
Also moved unit test of rounding mode into own test to make
obvious that rounding mode is independent of ldexp.
Fixes #2249
Change-Id: Iae66e080de141ace7f2b1298f94fdb867f8632fe
Szilárd Páll [Fri, 27 Oct 2017 16:24:36 +0000 (18:24 +0200)]
Implement alternating GPU wait
When both PME and nonbonded tasks are offloaded, instead of waiting in a
blocking call for each task in a predefined order, we poll the GPU
streams and start the reduction of the forces of the task that finishes
first. This allows overlapping one of the reductions with the GPU
compute/transfer of the task arriving second.
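The polling scheme can be sketched as below. This Python version only models the control flow: reduce_in_completion_order and finishes_after are hypothetical, the pollers simulate GPU stream queries, and the loop is a plain busy-wait rather than anything resembling the mdrun implementation.

```python
def finishes_after(n):
    """Simulated stream query: reports completion on the n-th poll."""
    state = {"calls": 0}

    def poll():
        state["calls"] += 1
        return state["calls"] >= n
    return poll


def reduce_in_completion_order(tasks, reduce):
    """Poll all pending tasks and reduce each one as soon as it finishes.

    tasks maps a task name to a non-blocking callable that returns True
    once that task is done; reduce is called with the task name.
    """
    pending = dict(tasks)
    order = []
    while pending:
        for name in list(pending):
            if pending[name]():   # non-blocking "is it done?" query
                reduce(name)      # reduce the finished task's forces now
                order.append(name)
                del pending[name]
    return order
```

If the nonbonded task finishes before PME, its reduction runs while PME is still in flight, and the reverse holds when PME finishes first, so no fixed waiting order is imposed.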
Change-Id: I612a0c5cae54bee04c1d587b98b6fc534e766de6
Alexey Shvetsov [Fri, 1 Dec 2017 14:12:36 +0000 (17:12 +0300)]
Fix build with cmake 3.10 on Linux
Without this fix cmake will fail, at least on Gentoo.
Change-Id: Ie04c2e5f5884f05c3648fed7289157e73fd8d81f
Signed-off-by: Alexey Shvetsov <alexxy@omrb.pnpi.spb.ru>
Aleksei Iupinov [Tue, 5 Dec 2017 09:12:02 +0000 (10:12 +0100)]
Do not accept unsupported combination "-pme cpu -pmefft gpu"
Previously "-pmefft" was silently ignored in this case,
now this causes an error.
Change-Id: Ia72fc5845591a9ff349a110268bec87e5b87464b
Mark Abraham [Tue, 5 Dec 2017 09:27:21 +0000 (20:27 +1100)]
Fix mdrun -nb auto -pme auto when GPUs are absent
The logic was flawed such that GPUs were "selected" for use even
though none had been detected. That led to the GPU behaviour of
avoiding using separate PME ranks.
Also made a minor fix to the logic for emulation. The new
interpretation of mdrun -gpu_id does not need to trigger an error when
GPU IDs have been supplied along with the emulation environment
variable.
Fixes #2315
Change-Id: I68da27c9bfef9f73b9dae4f04f196066d2efb1e2
Roland Schulz [Tue, 5 Dec 2017 02:18:25 +0000 (18:18 -0800)]
Fix ArrayRef<SimdDInt32> for SSE/AVX128
Fixes #2326
Change-Id: I0af16deb658984fb9d9cbf0cfaf9926ff479f9af
Mark Abraham [Mon, 4 Dec 2017 02:43:42 +0000 (13:43 +1100)]
Update testing to include cmake 3.10.0
Change-Id: Ic5b9cd0e4e5a86ccc0576d8c7bb9acc27cdc3d4c
Magnus Lundborg [Wed, 29 Nov 2017 12:26:49 +0000 (13:26 +0100)]
SIMD accelerated Urey-Bradley
Enable SIMD for Urey-Bradley without energies and shift forces.
Change-Id: I8bb94ad4be6db41c0b3d8e820ceacc5c1005146e
Berk Hess [Sat, 2 Dec 2017 00:25:35 +0000 (01:25 +0100)]
Fix DD exact continuation bug
With domain decomposition the local atom density, used for setting
the search grid for sorting particles, was based on the local atom
count including atoms/charge groups that would be moved to
neighboring cells. This led to a different density value, which in turn
could result in a different number of search grid cells and thus
a different summation order during a run versus when continuing a run
from checkpoint, when no atoms would be moved. Now exact continuation
is guaranteed for the domdec module with the mdrun -reprod option.
Refs #2318
Change-Id: I78452c7dfcf3ca6bdee63ece3795efc7e4ac467f
Berk Hess [Mon, 4 Dec 2017 15:24:11 +0000 (16:24 +0100)]
Fix pme gather in double with AVX(2)_128
The 4NSIMD PME gather change did not change the conditional
for grid alignment. This is made consistent here.
Note that the 4NSIMD change lowered the performance of PME gather
on AVX_128_FMA and AVX2_128 in double precision. We should consider
using 256-bit AVX for double precision instead.
Refs #2326
Change-Id: I07bfb3ca8d334bce18ed0b6989405bbc02c25b7b
Aleksei Iupinov [Wed, 29 Nov 2017 13:58:22 +0000 (14:58 +0100)]
Reformulate PME solving tests tolerances
Output complex grid tolerance is now expressed in terms of
B-spline moduli and SIMD exp() error.
Fixes #2306
Change-Id: Ie158597d4a4e798af38d57988278e89d408eb205
Berk Hess [Mon, 4 Dec 2017 12:38:52 +0000 (13:38 +0100)]
Merge "Merge branch release-2016 into release-2018" into release-2018
Mark Abraham [Mon, 4 Dec 2017 10:56:46 +0000 (21:56 +1100)]
Fix free_gpu
If a device context was not used, CUDA gives an error if we attempt to
clear it, so we must avoid clearing it.
Refs #2322
Change-Id: I67b8b2d263eaed9c7489a6de6f612b27496cc6c2
Berk Hess [Wed, 29 Nov 2017 16:43:18 +0000 (17:43 +0100)]
Fixed initial temperature reporting
Fixes #2314
Change-Id: I13dec05ede9b4ad976c22b4910ee02256dcaac74
Mark Abraham [Mon, 4 Dec 2017 10:15:15 +0000 (21:15 +1100)]
Fix unused variables in CPU pruning kernels
aj is unused when UNROLLJ != STRIDE, so should not be declared
Change-Id: I889e47a3d62ad5644a96c6d05403b4b9285975e4
Mark Abraham [Mon, 4 Dec 2017 09:18:21 +0000 (10:18 +0100)]
Merge branch release-2016 into release-2018
Change-Id: If84bee0ff744533c841532efbf29e6338c412c5a
Berk Hess [Mon, 4 Dec 2017 06:55:31 +0000 (07:55 +0100)]
Update mdrun signal help text
Updated mdrun help text on signal handling for old and recent changes
to the behavior.
Fixes #2324
Change-Id: I48dd30b7da3a1dc57331978c7d3b0e1509850187
Berk Hess [Sun, 3 Dec 2017 21:15:20 +0000 (22:15 +0100)]
Only stop at nstlist steps with -reprod
Stopping mdrun with two INT or TERM signals would always happen right
after the first global communication step. But this breaks exact
continuation. Now with mdrun -reprod a second signal will still stop
at a pair-list generation step, like with the first signal, so we can
still have exact continuation.
Refs #2318
Change-Id: If65c1215d2509d60c1c5a6444769e7809288e798
Erik Lindahl [Wed, 29 Nov 2017 21:58:38 +0000 (22:58 +0100)]
Fix compilation issues for AVX-512
- gcc-5.4.0 incorrectly requires the second argument of
_mm512_i32gather_pd() to be a double pointer instead
of void, but this should fix compilation for both
cases.
- Work around double precision permute instruction
only available with AVX512VL instructions.
Fixes #2312.
Change-Id: I31420e71064b1c5c25c8af29a1d41c7f372375c1
Berk Hess [Sat, 2 Dec 2017 21:37:29 +0000 (22:37 +0100)]
Clear vsite velocities for simple integrators
The simple integrator loops (introduced in 69470fc4) do not clear
the velocities of virtual sites. This allows velocities of virtual
sites to slowly increase over time. To prevent this, velocities
of virtual sites are now cleared in a separate loop.
Fixes #2316
Change-Id: I12ff0fae2cd3c45ad4e63bfeccfc8c88505cdb1e
Mark Abraham [Sun, 3 Dec 2017 12:30:01 +0000 (23:30 +1100)]
Fix fft5d pinning
A CUDA build on a node with no driver installed can never have
selected a CUDA pinning policy, and erroneously unpinning leads to a
fatal error. Instead, FFT5D now remembers whether it made pinning
possible, which can only occur when there was a driver and a valid
device, so that it can unpin only when appropriate.
Removed some C++ guards and named a variable more precisely.
Noted a TODO to make a Jenkins configuration to test this code
path.
Fixes #2322
Change-Id: I50ae9cdeeb26ac0d0bd5ecf48b28b44cf0716745
Berk Hess [Fri, 1 Dec 2017 13:03:11 +0000 (14:03 +0100)]
Tighten B-spline moduli single precision test tolerance from 6 to 1 ULP
Also get rid of the misused double precision tolerance helper.
Change-Id: I2babd8a7c5d4ab436e67a8b8d1ec0532a482ec94
Berk Hess [Thu, 30 Nov 2017 15:17:58 +0000 (16:17 +0100)]
Avoid assertion failure in AWH
With an unstable reaction coordinate or unequilibrated system, AWH
could cause an assertion to fail. Now AWH checks for valid coordinate
input and throws an exception with a clear message.
Change-Id: I059d9cd9fbff74fc096a9c1e4c16cf8d84b2118a