Merge branch 'release-2018'

author Aleksei Iupinov <a.yupinov@gmail.com>

Tue, 27 Feb 2018 13:08:17 +0000 (14:08 +0100)

committer Aleksei Iupinov <a.yupinov@gmail.com>

Tue, 27 Feb 2018 14:05:53 +0000 (15:05 +0100)
diff --git a/admin/builds/pre-submit-matrix.txt b/admin/builds/pre-submit-matrix.txt

index 3a4707b837468c868a49ae0a30365943484f572e..be93bcb7de5b488fbb2f12bdd05da07fb44a6725 100644 (file)
--- a/admin/builds/pre-submit-matrix.txt
+++ b/admin/builds/pre-submit-matrix.txt
@@ -27,7 +27,7 @@ gcc-4.8 gpu cuda-6.5 cmake-3.8.1 mpi npme=1 nranks=2 openmp
  # Test thread-MPI with CUDA
  # Test cmake version from before new FindCUDA support (in 3.8)
  # Test SIMD implementation of pair search for GPU code-path
-gcc-5 gpu cuda-8.0 thread-mpi openmp cmake-3.6.1 release-with-assert simd=avx2_256
+gcc-5 gpu cuda-9.0 thread-mpi openmp cmake-3.6.1 release-with-assert simd=avx2_256
  
  # Test newest cmake at time of release
  # Test with ThreadSanitizer (compiled without OpenMP, even though
@@ -80,7 +80,7 @@ icc-18 openmp opencl cuda-7.5 mpi release simd=avx2_256
  gcc-5 openmp simd=avx_128_fma opencl amdappsdk-3.0
  
  # TODO
-# Add support for CUDA 9.0
+# Add support for CUDA 9.1 + gcc 6
  # Add OpenMP support to ASAN build (but libomp.so in clang-4 reports leaks, so might need a suitable build or suppression)
  # Test hwloc support
  # Test newest supported LTS Ubuntu
diff --git a/docs/CMakeLists.txt b/docs/CMakeLists.txt

index e8114880a0fc60081dc7c8b081912d59619ed007..f18d56fd4a1cab6158aaef5a8ccb555d9b7e7135 100644 (file)
--- a/docs/CMakeLists.txt
+++ b/docs/CMakeLists.txt
@@ -135,6 +135,19 @@ if (SPHINX_FOUND)
          release-notes/2018/major/removed-features.rst
          release-notes/2018/major/portability.rst
          release-notes/2018/major/miscellaneous.rst
+        release-notes/2016/2016.5.rst
+        release-notes/2016/2016.4.rst
+        release-notes/2016/2016.3.rst
+        release-notes/2016/2016.2.rst
+        release-notes/2016/2016.1.rst
+        release-notes/2016/major/highlights.rst
+        release-notes/2016/major/new-features.rst
+        release-notes/2016/major/performance.rst
+        release-notes/2016/major/tools.rst
+        release-notes/2016/major/bugs-fixed.rst
+        release-notes/2016/major/removed-features.rst
+        release-notes/2016/major/miscellaneous.rst
+        release-notes/older/index.rst
          user-guide/index.rst
          user-guide/cutoff-schemes.rst
          user-guide/getting-started.rst
diff --git a/docs/release-notes/2016/2016.1.rst b/docs/release-notes/2016/2016.1.rst

new file mode 100644 (file)

index 0000000..aeade95
--- /dev/null
+++ b/docs/release-notes/2016/2016.1.rst
@@ -0,0 +1,173 @@
+GROMACS 2016.1 Release Notes
+----------------------------------------
+
+This version was released on October 28, 2016. These release notes
+document the changes that have taken place in GROMACS since the
+initial version 2016 to fix known issues. It also incorporates all
+fixes made in version 5.1.4.
+
+Made distance restraints work with threads and DD
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+The NMR distance restraints use several buffers for summing distances
+that were indexed based on the index of the thread+domain local ilist
+force atoms. This gives incorrect results with OpenMP and/or domain
+decomposition. Using the type index for the restraint and a domain-
+local, but not thread-local index for the pair resolves these issues.
+There are now only two limitations left:
+
+* Time-averaged restraints don't work with DD.
+* Multiple copies of molecules in the same system without ensemble
+  averaging do not work with DD.
+
+Note that these fixes have not been made in any 5.1.x release.
+
+:issue:`1117`
+:issue:`1989`
+:issue:`2029`
+
+Fixed Ewald surface+3DC corrections
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+Ewald surface and 3DC correction forces were only applied up to,
+but not including, the last atom with exclusions. With water at
+the end of the system only the last H would not be corrected.
+With ions at the end all ions would be missing.
+In addition, with the Verlet scheme and domain decomposition
+no force correction was applied at all.
+
+:issue:`2040`
+
+Fixed opening of wall table files
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+:issue:`2033`
+
+Fixed bug in gmx insert-molecules.
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+With option -ip, and if all trials were unsuccessful, a molecule was
+eventually incorrectly placed at 0/0/0 due to a memory error
+when referencing rpos[XX][mol].
+
+Made virial reproducible
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+OpenMP reduction was used to reduce virial contributions over threads,
+which does not have a defined order. This leads to different rounding,
+which makes runs non-reproducible (but still fully correct).
+Now thread local buffers are used.
+Also removed OpenMP parallelization for small counts (e.g. shift forces).
+
+Updated to support FFTW 3.3.5
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+The auto-download of FFTW now gets FFTW 3.3.5 and builds it properly,
+including with ``--enable-vsx`` when GMX_SIMD is set to VSX, i.e. for
+Power8, and ``--enable-avx512`` when GMX_SIMD is any of the AVX flavours
+(which is safe on non-512 now, works on KNL, and is presumed useful
+for future AVX512 architectures).
+
+Permitted automatic load balancing to disable itself when it makes the run slower
+"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+Under certain conditions, especially with (shared) GPUs, DLB can
+decrease the performance. We now measure the cycles per step before
+turning on DLB. When the running average of cycles per step with DLB
+gets above the average without DLB, we turn off DLB. We then measure
+again without DLB. If without DLB the cycle count is still lower,
+we keep DLB off for the remainder of the run. Otherwise it can turn
+on again as before. This procedure ensures that the performance will
+never deteriorate due to DLB.
+
+Improved the accuracy of timing for dynamic load balancing with GPUs
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+With OpenCL, the time for the local non-bonded to finish on the GPU
+was ignored in the dynamic load balancing. This change lets OpenCL
+take the same code path as CUDA.
+
+One internal heuristic parameter was far too small for both CUDA and
+OpenCL, which is now fixed.
+
+Corrected kernel launch bounds for Tesla P100 GPUs
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+This corrects our initial guess of kernel tuning parameters that resulted
+in reduced occupancy on sm_60 GPUs, and thus improves performance.
+
+Improved logic handling if/when the run is terminated for SETTLE warnings
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+The code now honours that when the environment variable
+GMX_MAXCONSTRWARN is set to -1, there is no maximum number of warnings.
+
+:issue:`2058`
+
+Fixed bug in gmx wham for reading pullx files.
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+Because the order of columns in the pullx files has changed recently,
+``gmx wham`` did not pick the reaction coordinate from ``pullx.xvg``
+if the COM of the pull groups were written. ``gmx wham`` was tested
+with various pull options and geometries.
+
+Fixed output bug in gmx wham
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+
+Fixed deadlock with thread-MPI
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+With thread-MPI mdrun could deadlock while pinning threads.
+
+:issue:`2025`
+
+Made error reporting in grompp more user friendly
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+This tool now always reports the file and line in user input files
+that lead to a condition such that subsequent parsing cannot continue.
+
+Fixed SIMD suggestion for VMX
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+
+Fixed script xplor2gmx.pl to work with GMXDATA
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+
+Fixed default nice level in mdrun-only build
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+Now an mdrun-only build should default to zero nice level, the same as
+``gmx mdrun`` in a normal build.
+
+Fixed math-test false positive
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+Depending on the accuracy of the floating point division, the
+input of the test function could be 1ulp too large or too small.
+If it was too large the result of the test function wasn't
+within 4ulp and the test failed.
+
+Improved documentation
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+
+Approaches for reducing overhead for GPU runs are now documented.
+
+The available wallcycle counters and subcounters reported in the
+md.log files are now listed and explained in the user guide, along
+with how to enable reporting of the subcounters.
+
+Several install-guide sections have been improved, including those for
+OpenCL, mdrun-only, and "make check". A "quick and dirty" cluster
+installation section was added.
+
+OpenCL error strings are now written, instead of cryptic error codes
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+
+Fixed build with GMX_USE_TNG=off
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+
+Removed variable-precision .gro writing
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+The precision used when writing .gro files is now fixed to 3, 4 and 5
+decimal places for x, v and box respectively to ensure compatibility with
+other software. Variable-precision reading is still supported.
+
+:issue:`2037`
+
+Fixed BG/Q platform files and install guide
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+Renamed the platform file to reflect normal practice
+and the install guide.
+
+Reduced the memory required for free-energy simulations
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+Pair lists with atoms whose short-ranged parameters are perturbed
+now use less memory.
+
+:issue:`2014`
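Note on "Made virial reproducible" in 2016.1.rst above: the entry depends on floating-point addition not being associative, so a reduction with no defined order can round differently from run to run. The sketch below is an illustration only, not GROMACS code. The helper `reduce_in_order` and the sample values are hypothetical; they stand in for per-thread buffers that are summed in a fixed order.

```python
# Illustration only: summation order changes floating-point rounding.
# `reduce_in_order` and the sample contributions are hypothetical.

def reduce_in_order(thread_buffers):
    """Sum per-thread contributions in a fixed, index-based order."""
    total = 0.0
    for contribution in thread_buffers:
        total += contribution
    return total

# Four per-thread contributions with very different magnitudes.
contribs = [1.0e16, 1.0, -1.0e16, 1.0]

# The same values combined in two different orders, as an unordered
# reduction might do on different runs.
sum_forward = reduce_in_order(contribs)
sum_reversed = reduce_in_order(list(reversed(contribs)))

# A fixed order gives the same bits every time it is repeated.
sum_repeat = reduce_in_order(contribs)

print(sum_forward, sum_reversed, sum_forward == sum_repeat)
```

Both orderings are valid roundings of the same mathematical sum, which matches the note's remark that the runs were non-reproducible but still correct.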
diff --git a/docs/release-notes/2016/2016.2.rst b/docs/release-notes/2016/2016.2.rst

new file mode 100644 (file)

index 0000000..92b6585
--- /dev/null
+++ b/docs/release-notes/2016/2016.2.rst
@@ -0,0 +1,238 @@
+GROMACS 2016.2 Release Notes
+----------------------------------------
+
+This version was released on February 7, 2017. These release notes
+document the changes that have taken place in GROMACS since version
+2016.1 to fix known issues. It also incorporates all fixes made in
+version 5.1.4 and several since.
+
+Fixes where mdrun could behave incorrectly
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Add grompp check for equipartition violation risk for decoupled modes
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+When atoms involved in an angle with constrained bonds have very
+different masses, there can be very weakly coupled dynamics modes.
+Default mdp settings are often not sufficiently accurate to obtain
+equipartitioning. This change adds a grompp check for this issue.
+
+Part of :issue:`2071`
+
+Disallow overwriting of dihedral type 9
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+It is no longer allowed to repeat blocks of parameters spanning multiple
+lines for dihedrals of type 9. It is also no longer allowed to
+override parameters or dihedrals of type 9. Both are too complex
+to properly check. It is still allowed to repeat parameter blocks
+consisting of a single line.
+Repeating a block with the same parameters would lead to incorrect
+dihedral potentials and forces.
+
+:issue:`2077`
+
+Fixed flat-bottom position restraints + DD + OpenMP
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+A (re)allocation was missing, causing a crash.
+
+:issue:`2095`
+
+Fixed multi-domain reruns
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+Old code cleanup led multi-domain rerun to crash because it failed to
+consider logic separated over two places.
+
+:issue:`2105`
+
+Fixes for mdrun performance issues
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Corrected CUDA sm_60 performance
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+The kernel launch now suits the SM size of the GP100 architecture.
+
+Fixes for ``gmx`` tools
+^^^^^^^^^^^^^^^^^^^^^^^
+
+Fixed some FFT handling in cross-correlation calculations
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+An array of complex numbers was created as an array of pointers and
+then passed to gmx_fft_1d. This does not work.
+
+:issue:`2109`
+
+Fixed gmx rmsf -q -oq
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+This led to the PDB file containing B-factors using coordinates based
+on those from the -s file, rather than -q file. gmx rmsf -oq was
+otherwise fine.
+
+Fixed crash in gmx order
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+gmx order used a cumbersome floating point method to compute
+a histogram, leading to an index value that could be negative.
+
+:issue:`2104`
+
+Fixed minor trjconv bug
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+gmx trjconv -novel -f in.pdb -o out.pdb probably works better now.
+
+Fixed time label print in gmx vanhove
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+
+Handled issuing warnings correctly in xpm2ps and membed
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+The code should not (over)write the output file before checking for
+errors. For membed, it is useful to require the user to fix issues in
+their input file before we unilaterally over-write it.
+
+Corrected documentation about eigenvalue handling
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+Some file format docs were out of step with the implementation in
+eigio.cpp.
+
+The behaviour of gmx anaeig -eig -eig2 was not properly documented.
+
+Made editconf B-factor attachment more useful in practice
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+B-factor values will be added to residues unless an index is larger
+than the number of residues or an option is specified. Protein residue
+indices can start from any number and, in case they start from a large
+number, there is no way to add B-factor values to residues.
+
+This patch changes it to add B-factor values to residues unless the
+number of B-factor values is larger than the number of residues.
+
+Fixed possible memory error with long selections
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+If a selection was more than 1000 characters long and there was a
+whitespace exactly at the 1000 point, a buffer overflow could occur.
+Replaced the buffer with std::string, simplifying the code
+significantly.
+
+:issue:`2086`
+
+Fixed use of position variables with plus/merge
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+If a selection contained a position variable (e.g., 'com of ...') that
+was used more than once, and at least one of those uses was with
+plus/merge, there were out-of-bounds memory writes.  This was caused by
+the internal position structure not getting fully initialized.
+Incomplete initialization happens in all contexts with such variables,
+but only plus/merge (and possibly permute) actually use the values that
+remained uninitialized, which caused them to incorrectly compute the
+amount of memory required to store the result.
+
+:issue:`2086`
+
+Improved documentation
+^^^^^^^^^^^^^^^^^^^^^^
+
+Made several minor improvements to documentation and messages to users
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+In particular, for selections:
+
+- Explained resindex and resnr keywords in selection help.
+- Explained how selection-enabled tools treat -s and -f input files.
+
+:issue:`2083`
+
+Clarified use of tau-p and pcoupltype
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+grompp used to permit the erroneous "tau-p = 5 5". This does not
+reflect that only one time constant is permitted for pressure coupling
+(unlike group-based temperature coupling). The recent fix for
+:issue:`1893` leads to the user receiving a grompp warning, so this
+improves the docs to make clear that pressure coupling is different.
+
+:issue:`1893`
+
+Portability enhancements
+^^^^^^^^^^^^^^^^^^^^^^^^
+
+Fixed x86 conditional on IBM s390x
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+The CpuInfoTest.SupportLevel test fails on IBM s390x because the wrong
+condition was used.
+
+Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1390149
+
+:issue:`2072`
+
+Build system enhancements
+^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Fixed compilation with CMAKE_CXX_FLAGS="-Wall -Werror"
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+:issue:`2073`
+
+Stopped trying to use objdump --reloc in the build system on Mac
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+Recent Xcode objdump does not support --reloc.
+
+The warning that is based on the output of running objdump was only
+implemented to work on Linux-like things, so should not spam the cmake
+output on other platforms.
+
+Improved the support for plugin loading in the build system
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+The mdrun-only and prefer-static-libs builds set the default for
+BUILD_SHARED_LIBS to off, which silently disabled plugin support
+for things like VMD-based I/O handling.
+
+Converted GMX_LOAD_PLUGINS to tri-state ON/OFF/AUTO so that if the
+preconditions for support are not met we can have suitable behaviour
+in each case.
+
+:issue:`2082`
+
+Turn off hwloc support when static lib found
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+Hwloc dependencies are not resolved at CMake time when static
+libhwloc.a is detected and in most of these cases link-time
+errors will prevent building GROMACS. As it is hard for a user to know
+how to solve such cryptic errors and hwloc is not a required dependency,
+we turn off hwloc support when a static lib is detected. The user can
+override this on the cmake command line.
+
+:issue:`1919`
+
+Fixed build with GMX_EXTERNAL_TNG=ON
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+
+House-keeping that reduces users' problems
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Mdrun prints invalid performance data less often
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+If mdrun finished before a scheduled reset of the timing information
+(e.g. from mdrun -resetstep or mdrun -resethway), then misleading
+timing information should not be reported.
+
+Related, the default reset step for gmx tune_pme was increased to 1500.
+
+:issue:`2041`
+
+Added a runtime check for number of threads in bonded code
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+Replaced a debug assertion on the number of OpenMP threads not being
+larger than GMX_OPENMP_MAX_THREADS with a fatal error.
+But since the listed-forces reduction is actually not required with
+listed forces, these are now conditional and mdrun can run with more
+than GMX_OPENMP_MAX_THREADS threads.
+
+:issue:`2085`
+
+Fixed integer narrowing in TNG reading for long trajectories
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+Reading of TNG trajectories with sufficiently large numbers of frames
+could truncate integers used for frame numbers. Fixed to use 64-bit
+integers as originally intended.
+
+Fixed logic of TRR reading
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+When reading a trr file, reaching the end of the file was
+indistinguishable from a reading error or a magic-number error. This
+is now fixed, restoring the intended behaviour in each case.
+
+:issue:`1926`
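Note on "Fixed integer narrowing in TNG reading for long trajectories" in 2016.2.rst above: the sketch below shows what narrowing a large frame number to 32 bits does to its value. It is an illustration only and does not use the TNG library. The variable names and the sample frame number are hypothetical, and `ctypes` is used only to mimic fixed-width C integers.

```python
# Illustration only: a frame number that fits in 64 bits but not in 32.
# The value and names are hypothetical; no TNG API is used here.
import ctypes

frame_number = 2**31 + 5  # one of the first values a signed 32-bit int cannot hold

# ctypes integer types do no overflow checking, so this wraps around
# the way a narrowing conversion to a 32-bit int typically does in C.
narrowed = ctypes.c_int32(frame_number).value

# A 64-bit integer keeps the value intact.
widened = ctypes.c_int64(frame_number).value

print(frame_number, narrowed, widened)
```

The narrowed value comes out negative, which is why frame indices past the 32-bit range need a 64-bit type along the whole reading path.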
diff --git a/docs/release-notes/2016/2016.3.rst b/docs/release-notes/2016/2016.3.rst

new file mode 100644 (file)

index 0000000..be5cfee
--- /dev/null
+++ b/docs/release-notes/2016/2016.3.rst
@@ -0,0 +1,114 @@
+GROMACS 2016.3 Release Notes
+----------------------------
+
+This version was released on March 14, 2017. These release notes
+document the changes that have taken place in GROMACS since version
+2016.2 to fix known issues. It also incorporates all fixes made in
+version 5.1.4 and several since.
+
+Fixes where mdrun could behave incorrectly
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Fixed mdrun with separate PME ranks hanging upon exit
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+A recent fix for another issue led to mdrun hanging while communicating
+with PME ranks to coordinate end-of-run performance statistics.
+
+:issue:`2131`
+
+Fixed handling of previous virials with md-vv integrator
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+These quantities get written to checkpoint files only for the Trotter
+pressure-coupling integrators that need them, but they were being
+copied in do_md for all Trotter integrators. This meant that an
+appending restart of md-vv plus nose-hoover plus no pressure coupling
+truncated off a correct edr frame and wrote one with zero virial and
+wrong pressure. And in the same case, a no-append restart writes a
+duplicate frame that does not agree with the one written before
+termination.
+
+:issue:`1793`
+
+Fixed an incorrect check that nstlog != 0 for expanded ensembles
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+The original version was accidentally reversed, causing it to
+fail when nstlog was not equal to 0.
+
+Fixes for ``gmx`` tools
+^^^^^^^^^^^^^^^^^^^^^^^
+
+Fixed ``gmx tune_pme`` detection of GPU support
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+
+Fixed spacing in ``gmx tune_pme`` call to thread-MPI mdrun
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+
+Fixed minor issues in ``gmx traj -av -af``
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+Made the description of the xvg y-axis more useful. Also works for
+option ``-af``.
+
+:issue:`2133`
+
+Removed rogue printing to xvg file in gmx mindist
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+``gmx mindist -xvg none`` is now adhered to, and printing is preceded by
+a comment.
+
+:issue:`2129`
+
+Fixed bug in ``gmx solvate -shell`` if it yielded 0 SOL.
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+In the transition from genbox to solvate, some incorrect logic was
+introduced.
+
+:issue:`2119`
+
+Corrected output of ``gmx do_dssp -sc``
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+This code has always written a probability, and not a percentage, so
+fixed the label. It still fits within the expected 8-character field.
+
+:issue:`2120`
+
+Improved documentation
+^^^^^^^^^^^^^^^^^^^^^^
+
+Made several minor improvements to documentation and messages to users
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+Removed documentation of unimplemented ``gmx trjconv -clustercenter``.
+
+Introduced system preparation section to user guide, to create
+somewhere to document the use and limitations of vdwradii.dat.
+Enhanced documentation of solvate and insert-molecules, similarly.
+
+:issue:`2094`
+
+Documented that we now support AMD GCN on Mesa/LLVM
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+AMD GPUs using Mesa 17.0+ and LLVM 4.0+ run GROMACS using OpenCL.
+
+Documented running Clang static analyzer manually
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+
+Portability enhancements
+^^^^^^^^^^^^^^^^^^^^^^^^
+
+Enabled avx512 in the GROMACS FFTW build only if the compiler supports it
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+Enabling avx512 requires GCC 4.9 or newer or Clang 3.9 or newer. Since
+we support compilers older than those, we can not afford to enable
+avx512 in ``GMX_BUILD_OWN_FFTW=on`` unconditionally.
+
+Worked around false positives in SIMD test from bug in xlc 13.1.5
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+atan2(0,0) should return 0.0, which the GROMACS simd implementation
+does. However, since at least one compiler produces -nan for the
+standard library version it's better to compare with the known
+correct value rather than calling std::atan2(0,0).
+
+:issue:`2102`
+
+Fixed compile with icc of ``GMX_SIMD=None``
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+ICC defines invsqrt in math.h
diff --git a/docs/release-notes/2016/2016.4.rst b/docs/release-notes/2016/2016.4.rst

new file mode 100644 (file)

index 0000000..1a05cf2
--- /dev/null
+++ b/docs/release-notes/2016/2016.4.rst
@@ -0,0 +1,298 @@
+GROMACS 2016.4 Release Notes
+----------------------------
+
+This version was released on September 15, 2017. These release notes
+document the changes that have taken place in GROMACS since version
+2016.3 to fix known issues. It also incorporates all fixes made in
+version 5.1.4 and several since.
+
+Fixes where mdrun could behave incorrectly
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Disabled PME tuning with the group scheme
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+PME tuning with the group cut-off scheme did not work correctly.
+Interactions between charge-group pairs at distances between ``rlist``
+and ``rcoulomb`` can go missing. The group scheme is deprecated, and
+this issue would require considerable effort to fix and test, so we
+have simply disabled PME tuning with the group scheme.
+
+:issue:`2200`
+
+Fixed value of Ewald shift
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+In all the Ewald short-ranged kernel flavours, the value of the
+potential at the cutoff is subtracted from the potential at the actual
+distance, which was done incorrectly (failing to divide the shift
+value by cutoff distance). Fortunately, the value of that distance is
+often close to 1, and the inconsistent shifts often cancel in
+practice, and energy differences computed on neighbour lists of the
+same size will have the error cancel. The difference doesn't even show
+up in the regressiontests, but would if we had a unit test of a single
+interaction.
+
+:issue:`2215`
+
+Fixed orientation restraint reference
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+The resetting of the COM of the molecule with orientation restraints
+for fitting to the reference structure was done with the COM of the
+reference structure instead of the instantaneous structure. This does
+not affect the restraining (unless ensemble averaging is used), only
+the printed orientation tensor.
+
+:issue:`2219`
+
+Fixed bugs with setup for orientation restraints
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+The orientation restraint initialization got moved to before the
+initialization of the domain decomposition, which made the check
+for domain decomposition fail.
+Also fixed orientation restraints not working with the whole system
+as fitting group.
+
+Worked around missing OpenMP implementation in orientation restraints
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+The orientation restraint code is not aware of OpenMP threads
+and uses some global information. By only running it on the
+master rank, results are now independent of number of threads
+used.
+
+:issue:`2223`
+
+Enable group-scheme SIMD kernels on recent AVX extensions
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+The group-scheme code only runs using the feature set of AVX_256, but
+that is supported on the more recent hardware, so we should have the
+group scheme run with the maximum suitable SIMD. With previous releases,
+building AVX_256 binaries was required for best performance with the
+(deprecated) group scheme.
+
+Fix FEP state with rerun
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+When using FEP states with rerun, the FEP state was always 0.
+
+:issue:`2244`
+
+Fixed COM pull force with SD
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+The reported COM pull force when using the SD integrator was random
+only. Now the pull force is summed over the systematic and random SD
+update components.  A better solution is to not add the random force
+at all, but such a change should not be done in a release branch.
+
+:issue:`2201`
+
+Fix PBC bugs in the swap code
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+:issue:`2245`
+
+Fixed flat-bottomed position restraints with multiple ranks
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+Reallocation was never done for flat-bottomed restraints during
+domain decomposition, so the indexing could go out of range, leading
+to segfaults.
+
+:issue:`2236`
+
+Fixed null pointer print in DD
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+Fixed a (rather harmless) print of a null pointer string during
+DD initialization. This would only show up with ``gmx mdrun -dlb yes``.
+
+Improved the "files not present" error message
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+It's possible to use ``gmx mdrun -deffnm`` in restarts even if it
+wasn't used in the initial simulation. This can lead to absurd
+situations such as:
+
+  Expected output files not present or named differently:
+    pullx.xvg
+    pullf.xvg
+
+where ``pullx.xvg`` and ``pullf.xvg`` are present and named exactly as
+listed, but GROMACS expects them to be named as ``-deffnm`` requested.
+
+The improved error message suggests that the user check for that
+possibility.
+
+:issue:`942` (partial workaround)
+
+Fixed LJ-PME + switch grompp error
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+An error call was missing in grompp when LJ-PME was requested in
+combination with a force or potential switch modifier.
+
+:issue:`2174`
+
+Fixed unused SIMD PME table kernel
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+The Verlet-scheme 2xNN PME kernel with tabulated correction had
+several issues. This kernel flavor could only be selected manually by
+setting an environment variable, so no user simulations should be
+affected.
+
+:issue:`2247`
+
+Fixed bugs in most double-precision Simd4 implementations
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+The double precision version of reduce() and dotProduct() returned a
+float with AVX2_256, AVX_256, AVX_128_FMA, AVX_512, MIC and IBM_QPX.
+Only reduce() is used in double, in the PME force gather, and the
+difference is small.
+
+:issue:`2162`
+
+Avoid inf in SIMD double sqrt()
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+Arguments > 0 and < float_min to double precision SIMD sqrt()
+would produce inf on many SIMD architectures. Now sqrt() will
+return 0 for arguments in this range, which is not fully correct,
+but should be unproblematic.
+
+:issue:`2164`
+:issue:`2163`
+
+Fix NVML error messages
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+These should refer to the API calls that failed, e.g. when users lack
+permissions to change clocks.
+
+Fixed IMD interface malfunctions
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+:issue:`2206`
+
+Fixed initial temperature reporting
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+When continuing a simulation from a checkpoint, mdrun could report
+double the initial temperature when ``nstcalcenergy=1`` or ``nsttcoupl=1``.
+Note that this only affected reporting; the actual velocities were
+correct.
+Now the initial temperature is no longer reported for continuation
+runs, since at continuation there is no "initial" temperature.
+
+:issue:`2199`
+
+Fix exception in SIMD LJ PME solve
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+Clear SIMD padding elements in solve helper arrays to avoid
+otherwise harmless floating-point overflow exceptions.
+
+:issue:`2242`
+
+Fixes for ``gmx`` tools
+^^^^^^^^^^^^^^^^^^^^^^^
+
+Fixed memory access issues in gmx solvate
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+There was out-of-bounds access if
+ 1) the solvent configuration was given as a .pdb file, or
+ 2) there was more than one type of residue in the solvent (which
+    triggered sorting).
+
+Also fix a memory leak in the sorting routine.
+
+Should fix crashes mentioned in :issue:`2148`
+
+Fixed a consistency check in ``gmx make_edi`` for flooding
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+If one sets up a flooding .edi input file with ``gmx make_edi``,
+the code should check that one does not use any of the last 6 eigenvectors
+of the covariance matrix, which correspond to the rotational and
+translational degrees of freedom.
+The check that was in the code erroneously checked against the
+number of eigenvalues neig that was stored in the .xvg file,
+not against the total number of eigenvectors which depends on
+the number of atoms nav used in gmx covar. Thus the original
+check would always fail if the .xvg eigenvalue file contained
+1-6 values only.
+
+Supported quiet trajectory-handling I/O
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+Permits ``GMX_TRAJECTORY_IO_VERBOSITY=0`` to be set to keep frame-reading
+code quiet, which is convenient for tools using libgromacs.
+
+Improved documentation
+^^^^^^^^^^^^^^^^^^^^^^
+
+Migrated much content from the wiki to the user guide
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+This includes
+* expanding the "Performance" section,
+* reworking extending simulations, doing restarts and reproducibility,
+* adding documentation for mdp option ``simulation-part``,
+* adding documentation for issues relating to floating-point arithmetic,
+* adding documentation for run-time errors.
+
+Corrected the PDF manual to reflect that all tools are called ``gmx <tool>``
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+There were still a few occurrences of the old-style ``g_tool`` naming,
+which this patch removes. Deliberately left ``g_membed`` as is, because there
+was never a ``gmx membed``, but instead it got incorporated into
+``gmx mdrun``.
+
+Clarified ``gmx editconf`` help text
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+It is possible that users can confuse ``-c`` with ``-center`` so this
+patch makes it clear that ``-center`` doesn't do anything unless the
+user really wants to shift the center of the system away from the
+middle of the box.
+
+:issue:`2171`
+
+Added missing .mdp file documentation for the enforced rotation module
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+
+Fixed parameter description for dihedral_restraints
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+The force-constant parameter for dihedral_restraints was not
+documented in the table of interaction types.
+
+:issue:`2144`
+
+Replaced instance of "group" by "coord" in pull .mdp documentation
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+
+Portability enhancements
+^^^^^^^^^^^^^^^^^^^^^^^^
+
+Supported CUDA 9/Volta for nonbonded kernels
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+Implemented production-quality support for Volta GPUs and CUDA 9.
+
+The code was adapted to support changes to the nature of warp
+synchrony, without disturbing support for older GPUs and/or
+CUDA. Further improvements may be seen (e.g. in the 2017 release).
+
+Really enabled AVX512 in the GROMACS-managed build of FFTW
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+An earlier attempt to enable AVX512 on GCC 4.9 or newer and
+Clang 3.9 or newer was wrongly implemented. Now this works on
+all compilers we officially support (MSVC, GCC, clang, ICC).
+
+Fixed aspects for compiling and running on Solaris
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+
+Fixed AVX512F compiler flags
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+Avoid using the MIC code generation flags for the Xeon code path.
+
+Fixed compiler flags for using MKL
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+
+Fixed compilation issues with ARM SIMD
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+ARM_NEON has never supported double-precision SIMD, so it is now
+disabled in double-precision builds of GROMACS.
+
+The maskzR* functions used the wrong argument order in the debug-mode
+pre-masking (and sometimes in a typo-ed syntax).
+
+In the shift operators, the clang-based compilers (including the
+armclang v6 compiler series) seem to check that the required immediate
+integer argument is given before inlining the call to the operator
+function. The inlining seems to permit gcc to recognize that the
+callers always use an immediate. In theory, the new code might
+generate code that runs a trifle slower, but we don't use it at the
+moment and the cost might be negligible if other effects dominate
+performance.
diff --git a/docs/release-notes/2016/2016.5.rst b/docs/release-notes/2016/2016.5.rst

new file mode 100644 (file)

index 0000000..8f031fc
--- /dev/null
+++ b/docs/release-notes/2016/2016.5.rst
@@ -0,0 +1,153 @@
+GROMACS 2016.5 Release Notes
+----------------------------
+
+This version was released on February 16, 2018. These release notes
+document the changes that have taken place in GROMACS since version
+2016.4 to fix known issues. It also incorporates all fixes made in
+version 5.1.5 (which was the last planned release in that series).
+
+Fixes where mdrun could behave incorrectly
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Fixed triclinic domain decomposition bug
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+With triclinic unit-cells with vectors a,b,c, the domain decomposition
+would communicate an incorrect halo along dimension x when b[x]!=0
+and vector c not parallel to the z-axis. The halo cut-off bound plane
+was tilted incorrectly along x/z with an error approximately
+proportional to b[x]*(c[x] - b[x]*c[y]/b[y]).
+When c[x] > b[x]*c[y]/b[y], the communicated halo was too small, which
+could cause instabilities or silent errors.
+When c[x] < b[x]*c[y]/b[y], the communicated halo was too large, which
+could cause some communication overhead.
+
+:issue:`2125`
+
+Required -ntmpi with setting -ntomp with GPUs
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+With GPUs and thread-MPI, setting only ``-ntomp`` could lead to
+oversubscription of the hardware threads.
+Now with GPUs and thread-MPI the user is required to set ``-ntmpi`` when
+using ``-ntomp``. We chose to also require ``-ntmpi`` when the user
+specifies both ``-nt`` and ``-ntomp``; there we could infer the number of
+ranks, but it's safer to ask the user to explicitly set ``-ntmpi``.
+Note that specifying both ``-ntmpi`` and ``-nt`` has always worked correctly.
+
+:issue:`2348`
+
+Prevented dynamic load balancing activating immediately after exchange
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+Turning on DLB right after exchanging replicas caused an assertion
+failure and is also useless.
+
+:issue:`2298`
+
+Avoided confusing message at end of non-dynamical runs
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+EM, TPI, NM, etc. are not targets for performance optimization
+so we will not write performance reports. This commit fixes
+an oversight whereby we would warn a user when the lack of
+performance report is normal and expected.
+
+:issue:`2172`
+
+Changed to issue fewer messages when ``-cpi`` checkpoint file is not present
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+Removed duplicated message.
+
+:issue:`2173`
+
+Disallowed combination of PME-user and Verlet cutoff
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+:issue:`2332`
+
+Added missing Ewald correction for pme-user
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+With ``coulomb-type = pme-user``, the Ewald mesh energy was not subtracted
+leading to (very) incorrect Coulomb energies and forces.
+
+:issue:`2286`
+
+Fixed thread-MPI rank choice for orientation restraints
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+Only a single rank is supported, so that must be what the thread-MPI
+code will choose. There's another check later on that catches the
+multi-rank MPI case.
+
+Fixed nstlist increase warning print
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+The log file warning message had a buggy conditional which this commit
+fixes.
+
+Removed incorrect comment for CHARMM tips3p
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+
+Fixed incorrect dV/dlambda for walls
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+The free-energy derivative dV/dlambda for walls, which can
+be perturbed by changing atom types of non-wall atoms, only
+contained the B-state contribution.
+
+:issue:`2267`
+
+Fixed warning for confout with periodic molecules
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+With periodic molecules, ``gmx mdrun`` would incorrectly attempt to make
+molecules whole for writing the final state to confout.
+
+:issue:`2275`
+
+Fixed wrong megaflop accounting
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+Some nrnb index entries were missing in the interaction_function
+array, leading to wrong megaflop accounting being printed.
+
+:issue:`2274`
+
+Fixes for ``gmx`` tools
+^^^^^^^^^^^^^^^^^^^^^^^
+
+Fixed ``gmx grompp`` net charge check
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+The grompp check for the net charge would ignore molecule blocks
+at the end when molecule types are used in multiple, non-consecutive
+molecule blocks.
+
+:issue:`2407`
+
+Extended ``gmx grompp`` missing energy term message
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+:issue:`2301`
+
+Fixed ``gmx genion`` charge summation accuracy
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+``gmx genion`` accumulated the charge in a float, which could cause
+underestimation of the net charge for highly charged systems.
+
+:issue:`2290`
+
+Fixed ``gmx check`` for tprs with different #atoms
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+:issue:`2279`
+
+Fixed ``gmx grompp`` with Andersen massive and no COM removal
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+Fixed a floating point exception leading to a crash.
+Also fixed possible different rounding for the interval for
+Andersen massive in ``gmx grompp`` from ``gmx mdrun`` for
+the common case where ``tau_t`` is a multiple of ``delta_t``.
+
+:issue:`2256`
+
+Improved documentation
+^^^^^^^^^^^^^^^^^^^^^^
+
+Updated documentation of Nose-Hoover output
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+The documentation of Nose-Hoover chain variable printing was
+(long) outdated.
+
+:issue:`2301`
+
+Clarified docs for Fmax in EM
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
diff --git a/docs/release-notes/2016/major/bugs-fixed.rst b/docs/release-notes/2016/major/bugs-fixed.rst

new file mode 100644 (file)

index 0000000..64e2525
--- /dev/null
+++ b/docs/release-notes/2016/major/bugs-fixed.rst
@@ -0,0 +1,285 @@
+Bugs fixed
+^^^^^^^^^^
+
+These document fixes for issues that have been fixed for the 2016
+release, but which have not been back-ported to other branches.
+
+Fixed two problems related to restarts for velocity-Verlet
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+The first problem is more serious; in addition to causing problems
+with restarts in most cases for velocity-Verlet integrators plus either
+Berendsen or v-rescale temperature-coupling algorithms, the
+temperature coupling code was called twice. This made the distribution of
+kinetic energies too broad (but with the correct average).
+Other algorithm combinations were unaffected.
+
+In the second problem, the initial step after restarts with velocity-Verlet
+integrators and either Berendsen or v-rescale temperature-coupling algorithms
+had too high a pressure because they used an empty virial matrix that
+was only filled with MTTK pressure control. The effects of this bug were
+very small; it only affected the volume integration for one step on restarts.
+
+:issue:`1883`
+
+Fixed Verlet buffer calculation with nstlist=1
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+Under rare circumstances the Verlet buffer calculation code was
+called with nstlist=1, which caused a division by zero. The division
+by zero is now avoided.
+Furthermore, grompp now also determines and prints the Verlet buffer
+sizes with nstlist=1, which provides the user with information and adds
+consistency checks.
+
+:issue:`1993`
+
+Fixed large file issue on 32-bit platforms
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+At some point gcc started to issue a warning instead of a fatal error
+for the checking code; fixed to really generate an error now.
+
+:issue:`1834`
+
+Avoided using abort() for fatal errors
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+This avoids situations that produce useless core dumps.
+
+:issue:`1866`
+
+Fixed possible division by zero in polarization code
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+
+Avoided numerical overflow with overlapping atoms in Verlet scheme
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+The Verlet-scheme kernels did not allow overlapping atoms, even if
+they were not interacting (in contrast to the group kernels). Fixed by
+clamping the interaction distance so it can not become smaller than
+~6e-4 in single and ~1e-18 in double, and when this number is later
+multiplied by zero parameters it will not influence forces. The
+clamping should never affect normal interactions; mdrun would
+previously crash for distances that were this small.  On Haswell, RF
+and PME kernels get 3% and 1% slower, respectively.  On CUDA, RF and
+PME kernels get 1% and 2% faster, respectively.
+
+:issue:`1958`
+
+Relax pull PBC check
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+The check in the pull code for COM distances close to half the box
+was too strict for directional pulling. Now dimensions orthogonal
+to the pull vector are no longer checked. (The check was actually
+not strict enough for directional pulling along x or y in triclinic
+unit cells, but that is a corner case.)
+Furthermore, the direction-periodic hint is now only printed with
+geometry direction.
+
+:issue:`1962`
+
+Add detection for ARMv7 cycle counter support
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+ARMv7 requires special kernel settings to allow cycle
+counters to be read. This change adds a cmake setting
+to enable/disable counters. On all architectures but ARMv7
+it is enabled by default, and on ARMv7 we run a small test
+program to see if it can be executed successfully. When
+cross-compiling to ARMv7, counters will be disabled, but
+either choice can be overridden by setting a value for
+GMX_CYCLECOUNTERS in cmake.
+
+:issue:`1933`
+
+Introduced fatal error for too few frames in ``gmx dos``
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+To prevent ``gmx dos`` from crashing with an incomprehensible error
+message when there are too few frames, test for this.
+
+Part of :issue:`1813`
+
+Properly reset CUDA application clocks
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+We now store the application clock values we read when starting mdrun
+and reset to these values, but only when clocks have not been changed
+(by another process) in the meantime.
+
+:issue:`1846`
+
+Fixed replica-exchange debug output to all go to the debug file
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+When ``mdrun -debug`` was selected with replica exchange, some of the
+order description was printed to mdrun's log file, but it looks like the
+actual numbers were being printed to the debug log. This puts them
+both in the debug log.
+
+Fixed gmx mdrun -membed to always run on a single rank
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+This used to give a fatal error if default thread-MPI mdrun had chosen
+more than one rank, but it will now correctly choose to use a single rank.
+
+Fixed issues with using int for number of simulation steps
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+Mostly we use a 64-bit integer, but we messed up a few
+things.
+
+During mdrun -rerun, edr writing complained about the negative step
+number, implied it might be working around it, and threatened to
+crash, which it can't do. Silenced the complaint during writing,
+and reduced the scope of the message when reading.
+
+Fixed TNG wrapper routines to pass a 64-bit integer like they should.
+
+Made various infrastructure use gmx_int64_t for consistency, and noted
+where in a few places the practical range of the value stored in such
+a type is likely to be smaller. We can't extend the definition of XTC
+or TRR, so there is no proper solution available. TNG is already good,
+though.
+
+:issue:`2006`
+
+Fixed trr magic-number reading
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+The trr header-reading routine returned an "OK" value even if the
+magic number was wrong, which might lead to chaotic results
+everywhere.  This led to problems if other code (e.g. cpptraj)
+mistakenly wrote a wrong-endian trr file, which was then used with
+GROMACS. (This should never be a thing for XDR files, which are
+defined to be big endian, but such code has existed.)
+
+:issue:`1926`
+
+Changed to use only legal characters in OpenCL cache filename
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+The option to cache JIT-compiled OpenCL short-ranged kernels needed to
+be hardened, so that mdrun would write files whose names would usually
+be specific to the device, but also only contain characters that would
+work everywhere, i.e. only alphanumeric characters from the current
+locale.
+
+Fixes for bugs introduced during development
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+These document fixes for issues that were identified as having been
+introduced into the release-2016 branch since it diverged from
+release-5-1. These will not appear in the final release notes, because
+no formal release is thought to have had the problem. Of course, the
+Redmine issues remain available should further discussion arise.
+
+Fixed bug in v-rescale thermostat & replica exchange
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+
+Commit 2d0247f6 made random numbers for the v-rescale thermostat that
+did not vary over MD steps, and similarly the replica-exchange random
+number generator was being reset in the wrong place.
+
+:issue:`1968`
+
+Fixed vsite bug with MPI+OpenMP
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+The recent commit b7e4f30d caused non-local virtual sites not to be
+treated when using OpenMP. This means their coordinates lagged one
+step behind and their forces were not spread to the atoms, leading
+to small errors in the forces. Note that non-local virtual sites are
+only used when local virtual sites use them as a constructing atom;
+the most common case is a C/N in a CH3/NH3 group with vsite H's.
+Also added a check on the vsite count for debug builds.
+
+:issue:`1981`
+
+Fixed some thread affinity cases
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+Fixed one deadlock in newly refactored thread-affinity code, which
+happened with automatic pinning, if only part of the nodes were full.
+
+There is one deadlock still theoretically possible: if thread-MPI
+reports that setting the affinity is not possible only on a subset of
+ranks, the code deadlocks.  This has always been there and might never
+happen, so it is not fixed here.
+
+Removed OpenMP overhead at high parallelization
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+Commit 6d98622d introduced OpenMP parallelization for ``for`` loops
+clearing or incrementing rvecs. For small numbers of atoms per
+MPI rank this can increase the cost of the loop by up to a factor of 10.
+This change disables OpenMP parallelization at low atom count.
+
+Removed std::thread::hardware_concurrency()
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+We should not use std::thread::hardware_concurrency() for determining
+the logical processor count, since it only provides a hint.
+Note that we still have 3 different sources for this count left.
+
+Added support for linking against external TinyXML-2
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+This permits convenient packaging of GROMACS by distributions, but
+it got lost from gerrit while rebasing.
+
+:issue:`1956`
+
+Fixed data race in hwinfo with thread-MPI
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+:issue:`1983`
+
+Fixes for Power7 big-endian
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+Now compiles and passes all tests in both double and single precision
+with gcc 4.9.3, 5.4.0 and 6.1.0 for big-endian VSX.
+
+The change for the code in incrStoreU and decrStoreU addresses an
+apparent regression in 6.1.0, where the compiler thinks the type
+returned by vec_extract is a pointer-to-float, but my attempts at a
+reduced test case haven't reproduced the issue.
+
+Added some test cases that might hit more endianness cases in future.
+
+We have not been able to test this on little-endian Power8; there is
+a risk the gcc-specific permutations could be endian-sensitive. We'll
+test this when we have hardware access, or if somebody runs the tests
+for us.
+
+:issue:`1997`
+:issue:`1988`
+
+Reduce hwloc & cpuid test requirements
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+On some non-x86 linux platforms hwloc does not report
+caches, which means it will fail our strict test
+requirements of full topology support. There is no
+problem whatsoever with this, so we reduce the
+test to only require basic support from hwloc - this
+is still better than anything we can get ourselves.
+Similarly for CPUID, it is not an error for an
+architecture to not provide any of the specific flags
+we have defined, so avoid marking it as such.
+
+:issue:`1987`
+
+Work around compilation issue with random test on 32-bit machines
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+gcc 4.8.4 running on 32-bit Linux fails a few
+tests for random distributions. This seems
+to be caused by the compiler doing something
+strange (that can lead to differences in the lsb)
+when we do not use the result as floating-point
+values, but rather do exact binary comparisons.
+This is valid C++, and bad behaviour of the
+compiler (IMHO), but technically it is not required
+to produce bitwise identical results at high
+optimization. However, by using floating-point
+tests with zero ULP tolerance the problem
+appears to go away.
+
+:issue:`1986`
+
+Updated ``gmx wham`` for the new pull setup
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+This brings ``gmx wham`` up to date with the new pull setup where the pull
+type and geometry can now be set per coordinate and the pull
+coordinate has changed and is more configurable.
+
+Fix membed with partial revert of 29943f
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+The membrane embedding algorithm must be initialized before
+we call init_forcerec(), so it cannot trivially be moved into
+do_md(). This has to be cleaned up anyway for release-2017
+since we will remove the group scheme by then, but for now
+this fix will allow us to have the method working in release-2016.
+
+:issue:`1998`
diff --git a/docs/release-notes/2016/major/highlights.rst b/docs/release-notes/2016/major/highlights.rst

new file mode 100644 (file)

index 0000000..6b136c3
--- /dev/null
+++ b/docs/release-notes/2016/major/highlights.rst
@@ -0,0 +1,40 @@
+Highlights
+^^^^^^^^^^
+
+|Gromacs| 2016 was released on August 4, 2016. Patch releases
+have been made since then; please use the updated versions!  Here are
+some highlights of what you can expect, along with more detail in the
+links below!
+
+* As always, we've got several useful performance improvements, with or
+  without GPUs. CPU-side SIMD and threading enhancements will
+  make GPU-accelerated simulations faster even if we'd left the GPU
+  code alone! Thanks to these and additional GPU kernel improvements,
+  in GPU-accelerated runs expect around 15% improvement
+  in throughput. (And not just for plain vanilla MD, either... the
+  pull code now supports OpenMP threading throughout, and
+  multi-simulations have less coupling between simulations.)
+* We have a new C++11 portability layer permitting us to SIMD-accelerate
+  lots of minor routines on the CPU. These will also often
+  improve runs that use accelerators or many nodes through better load
+  balancing. POWER8, ARM64, AVX512 (KNL), and more are fully SIMD accelerated now
+  because they are supported in the new portability layer!
+* We made further SIMD acceleration of bonded interactions which
+  reduces their calculation time by about a factor of 2. This improves
+  load balance at high parallelization by a factor of 2, and shows
+  significantly better scaling.
+* Similarly, SIMD acceleration of SETTLE reduces the time for
+  constraints by a factor of 3 to 5 - which has a strong effect for GPU runs.
+* OpenCL GPU support is now available with all combinations of MPI,
+  thread-MPI and GPU sharing (i.e. the same as CUDA). Kernel performance
+  has improved by up to 60%. AMD GPUs benefit the most; OpenCL on NVIDIA is
+  generally still slow.
+* Tools in the new analysis framework can handle trajectories that
+  are subsets of the simulation system.
+* New pull coordinate geometries: angle-axis, dihedral, and normal angle.
+* Checkpoint restarts work only in the cases where the implementation
+  can always do what the user wants.
+* The version numbering has changed to be the year of the release,
+  plus (in future) a patch number. GROMACS 2016 will be the initial
+  release from this branch, then GROMACS 2016.1 will contain fixes for
+  the bugs found in GROMACS 2016, etc.
diff --git a/docs/release-notes/2016/major/miscellaneous.rst b/docs/release-notes/2016/major/miscellaneous.rst

new file mode 100644 (file)

index 0000000..6477065
--- /dev/null
+++ b/docs/release-notes/2016/major/miscellaneous.rst
@@ -0,0 +1,136 @@
+Miscellaneous
+^^^^^^^^^^^^^
+
+Various improvements to documentation and tests
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+
+In particular, the definition of pressure in the reference manual
+should be in bar, and a spurious r_ij in the force for the Morse
+potential was removed. Added documentation and literature references
+for membrane embedding. Improved template analysis program
+documentation. gmock was patched to work with gcc 6.
+
+:issue:`1932`
+
+Improved make_ndx help text
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+Clarified the use of boolean operators. The old help text could
+incorrectly hint that AND, OR, and NOT would work as keywords.
+Added a reference to ``gmx select`` that in most cases can serve as a
+replacement.
+
+:issue:`1976`
+
+Added checks on number of items read in mdp statements
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+Added checks for the number of items read in all
+sscanf() statements processing data from the mdp
+file.
+
+:issue:`1945`
+
+Work around glibc 2.23 with CUDA
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+glibc 2.23 changed the behaviour of string.h in a way that broke all
+versions of CUDA with all gcc compiler versions. The GROMACS build
+system detects this glibc, and works around it by adding the
+_FORCE_INLINE preprocessor define to CUDA compilation.
+
+:issue:`1982`
+
+Split NBNXN CUDA kernels into four compilation units
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+The CUDA nonbonded kernels are now built in four different compilation units
+when this is possible, i.e. for devices with compute capability >= 3.0. This
+can dramatically reduce compilation time.
+
+Forcing the use of a single compilation unit can be done using the
+GMX_CUDA_NB_SINGLE_COMPILATION_UNIT cmake option.
+
+:issue:`1444`
+
+Added stream flushes when not writing newline character
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+Some of our routines use the carriage return without a newline
+to keep writing the status e.g. on stderr.
+For some operating systems this seems to lead to the output
+being cached in the buffers, so this change adds an explicit
+fflush() for these print statements.
+
+Fixed :issue:`1772`
+
+Supported cmap with QMMM
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+Formerly, QMMM only supported bonded interactions using up to 4 atoms.
+Now any number is supported and some hard-coded assumptions have been
+removed.
+
+Upgraded support for lmfit library
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+Now based on lmfit 6.1. The CMake option GMX_EXTERNAL_LMFIT permits
+linking an external lmfit package, rather than the one bundled in
+GROMACS.
+
+:issue:`1957`
+
+libxml2 is no longer a dependency
+""""""""""""""""""""""""""""""""""""""""""""""""""""""
+GROMACS used to use libxml2 for running its test code. This has been
+replaced by a bundled version of tinyxml2 (or optionally, a system
+version of that library).
+
+Disable automated FFTW3 builds on Windows
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+The FFTW distribution does not include configurations to
+build it automatically on windows, in particular not through
+the ``./configure; make; make install`` triad.
+
+:issue:`1961`
+
+Remove warnings on checkpoint mismatch
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+mdrun now only warns for mismatch in minor version, build or
+number of ranks used when reproducibility is requested.
+Also added a separate message for not matching precision.
+
+:issue:`1992`
+
+Report the filename and the line number on failure
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+Extend the call to gmx_fatal in fget_lines() to report the filename and
+the line number where the read failed.
+
+Handled constraint errors with EM
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+All energy minimizers could fail with random errors when constraining
+produced NaN coordinates.
+Steepest descents now rejects steps with a constraint error.
+All other minimizers produce a fatal error with the suggestion to use
+steepest descents first.
+
+:issue:`1955`
+
+Disable static libcudart on OS X
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+Recent versions of CMake enable a static version of
+libcudart by default, but this breaks builds at least
+on the most recent version (10.11) of OS X, so we
+disable it on this platform.
+
+Fixed rare issue linking with clock_gettime
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+Misuse of preprocessing commands might have led to inappropriate
+use of clock_gettime().
+
+:issue:`1980`
+
+Disabled NVIDIA JIT cache with OpenCL
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+The NVIDIA JIT caching is known to be broken with OpenCL compilation in
+the case when the kernel source changes but the path does not change
+(e.g. kernels get overwritten by a new installation). Therefore we disable
+the JIT caching when running on NVIDIA GPUs. AMD GPUs are unaffected.
+
+
+:issue:`1938`
+
diff --git a/docs/release-notes/2016/major/new-features.rst b/docs/release-notes/2016/major/new-features.rst

new file mode 100644 (file)

index 0000000..603737b
--- /dev/null
+++ b/docs/release-notes/2016/major/new-features.rst
@@ -0,0 +1,207 @@
+New and improved features
+^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Changed to require a C++11 compiler
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+GROMACS now requires both a C++11 and C99 compiler. For details, see
+the install guide.
+
+Changed to support only CUDA 5.0 and more recent versions
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+
+:issue:`1831`
+
+Allowed rcoulomb > rvdw with PME
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+GROMACS has had kernels that support Coulomb PME + cut-off LJ
+with rcoulomb > rvdw for a while, but these were only available via
+PME load balancing. Now we allow this setup to be chosen also
+through mdp options.
+
+Added optional support for portable hardware locality (hwloc)
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+Added CMake support to detect and build GROMACS with hwloc, which
+will improve GROMACS's ability to recognize and take advantage of all
+the available hardware. If hwloc is unavailable, GROMACS will fall back
+on other detection routines.
+
+Made normal-mode calculations work with shells and vsites
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+Implemented shells and vsites in normal-mode analysis in mdrun and in
+analysis of eigenvalues and frequencies. The normal-mode analysis
+is done on real atoms only and the shells are minimized at each step
+of the analysis.
+
+:issue:`879`
+
+Changed pull group count for coords stored in tpr file
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+Added a parameter ngroup to the pull coord parameters. This is now
+also stored in the tpr file. This makes the pull geometry forward
+compatible, which is useful since it avoids bumping the .tpr version
+with every new geometry, and we expect that users want to experiment
+with new geometries.
+
+Added pull coordinate geometry angle-axis
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+The new geometry is described in the docs.
+Some checks in readpull.cpp were reorganized since adding new
+geometries made some old logic a bit convoluted.
+
+Added pull coordinate geometry dihedral (angle)
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+How to use the new geometry is explained in the docs.
+
+Added pull coordinate geometry angle
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+A new subsection was added to the docs explaining the new geometry.
+
+Replaced ``pull-print-com1,2`` mdp option with ``pull-print-com``
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+Changes were made to the pull output order and naming.
+
+Added pull potential flat-bottom-high
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+Added the new pull coordinate type flat-bottom-high, which is a flat
+potential above the reference value and harmonic below.
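The shape of such a potential can be sketched as follows. This is only an illustration: the function name, the argument names and the conventional 1/2 k x^2 harmonic form are assumptions, not taken from the pull code.

```python
def flat_bottom_high_energy(value, reference, k):
    """Energy of a potential that is flat above `reference` and harmonic below.

    Hypothetical helper; assumes the usual 0.5 * k * x**2 harmonic form.
    """
    deviation = value - reference
    if deviation >= 0.0:
        # At or above the reference value the potential is flat.
        return 0.0
    # Below the reference value the potential is harmonic.
    return 0.5 * k * deviation * deviation
```

With these assumptions, a coordinate above the reference feels no restraint, while one below it is pushed back up towards the reference.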
+
+Added ``gmx grompp`` check for pull group
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+Added a check for valid pull groups in a pull coordinate.
+Using a pull group index that was out of range would cause invalid
+memory access.
+
+Added new swapping functionality to computational electrophysiology module
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+Support was added for ion/water position swapping with a user-defined
+number of ionic species, including (small) polyatomic ions.
+
+Also added two extra .mdp file parameters 'bulk-offset' that allow the
+user to specify an offset of the swap layers from the compartment
+midplanes. This is useful for setups where e.g. a transmembrane
+protein extends far into at least one of the compartments. Without an
+offset, ions would be swapped in the vicinity of the protein, which is
+not wanted. Adding an extended water layer comes at the cost of
+performance, which is not the case for the offset solution.
+
+Documentation and testing were improved.
+
+Fixed logic for DD missing-interactions check
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+The code that was intended to double check that the domain decomposition
+algorithm has not missed any interactions was inactive in several
+cases, and has been fixed.
+
+:issue:`1882`, :issue:`1793`
+
+Permitted forces and velocities to be written to compressed TNG
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+If there is no uncompressed coordinate output, write forces
+and velocities to the TNG file with compressed coordinate
+output. If there is uncompressed coordinate output to a
+TNG file, forces and velocities will be written to it.
+
+Use a greatest common divisor to set the frequency of some TNG
+data output to ensure lambdas and box shape are written at least
+as often as anything else.
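The greatest-common-divisor idea can be sketched as below. The function name is hypothetical, and treating an interval of 0 as "this output is disabled" is an assumption made for the sketch; the real selection logic in mdrun may differ.

```python
from functools import reduce
from math import gcd


def common_output_interval(intervals):
    """Return an interval that divides every active output interval.

    Hypothetical helper: intervals of 0 are treated as disabled outputs.
    """
    active = [n for n in intervals if n > 0]
    # Data written at the GCD interval is present at every step where any
    # of the other outputs is written.
    return reduce(gcd, active) if active else 0
```

For example, with outputs every 1000 and 2500 steps, writing lambdas and box shape every 500 steps guarantees they accompany every other frame.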
+
+:issue:`1863`
+
+Added new notes to the user when coupling algorithms are unavailable
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+mdrun will now give the user an explanatory note when pressure and/or
+temperature coupling is turned off.
+
+Added mdrun check for finite energies
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+Added a check that the total potential energy is finite. This check is
+nearly free and can catch issues with incorrectly set up systems
+before users get a confusing constraint or PME error. Note that this
+check is only performed at steps where energies are calculated, so it
+will often not catch an exploding system.
+
+Added ``gmx grompp`` check for unbound atoms
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+``gmx grompp`` now prints a note for atoms that are not connected by a
+potential or constraint to any other atom in the same moleculetype,
+since this often means the user made a mistake.
+
+:issue:`1958`
+
+Improved multi-simulation signalling
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+
+Multi-simulations (including REMD) may need to send messages
+between the simulations. For example, REMD needs to write a
+fully-consistent set of checkpoint files so that the restart works
+correctly, but normal multi-simulations are fine with decoupled
+progress and will simulate more efficiently if they can do
+so. Similarly, ``gmx_mpi mdrun -maxh -multi`` needs to synchronize
+only for REMD. The implementation has been upgraded so that such
+coupling happens only when an algorithm chosen by the user requires
+it.
+
+:issue:`860`, :issue:`692`, :issue:`1857`, :issue:`1942`
+
+Changed multi-simulation nsteps behaviour
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+
+It is unclear what the expected behaviour of a multi-simulation should
+be if the user supplies any of the possible non-uniform distributions
+of init_step and nsteps, sourced from any of .mdp, .cpt or command
+line. Previously mdrun adjusted the total number of steps to run so
+that each run did the same number of steps, but now it reports on the
+non-uniformity and proceeds, assuming the user knows what they are
+doing.
+
+:issue:`1857`
+
+Added working directory to things reported in .log file
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+When running GROMACS via a batch script, it is useful to know which
+working directory is being used for relative paths (file names) in the
+command line. This is now written alongside other header information.
+
+Prevented fragile use cases involving checkpoint restarts and/or appending
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+
+All output files named in the checkpoint file (i.e. that were
+used in the previous run) must be present before a checkpoint
+restart will be permitted. Thus,
+workflows where people used things like
+``gmx mdrun -s production -cpi equilibration``
+are no longer available to do a "continuous" restart. Instead, use
+``gmx grompp -t equilibration -o production``.
+
+:issue:`1777`
+
+Removed warning after OpenMP core-count check
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+In many cases ``gmx_mpi mdrun`` issued a warning that compared the total
+core count with something different returned from OpenMP. This problem
+is caused by inappropriate management of thread affinity masks, but
+the wording of the message did not help the user realise this, so it has
+been removed. ``gmx_mpi mdrun -pin on`` may help improve performance in
+such cases.
+
+Preparation for hardware detection might try to force offline cores to work
+"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+
+Hardware detection might be foiled by kernels that take cores offline
+when work is unavailable. We are not aware of any such platforms on which
+GROMACS is likely to be used, but we will probably start to see them
+soon. On such platforms, if the number of cores physically present
+differs from the count that are online, we try to force them online
+before deciding how GROMACS will use the online cores. For now, no x86 or
+PowerPC platforms need such code, so it will never run on those platforms.
+The good news is that we no longer have to risk making a confusing warning
+about such possibilities.
+
+Added new suggestion for users to try e.g. hyper-threading, if it's disabled
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+GROMACS tends to perform best with several hardware threads available
+per core (e.g. hyper-threading turned on, on x86), and now the log file
+will note explicitly when such opportunities exist.
+
+
diff --git a/docs/release-notes/2016/major/performance.rst b/docs/release-notes/2016/major/performance.rst

new file mode 100644 (file)

index 0000000..020795a
--- /dev/null
+++ b/docs/release-notes/2016/major/performance.rst
@@ -0,0 +1,162 @@
+Performance improvements
+^^^^^^^^^^^^^^^^^^^^^^^^
+
+GPU improvements
+^^^^^^^^^^^^^^^^
+
+In addition to those noted below, overall minor improvements contribute
+up to 5% increase in CUDA performance, so depending on parameters and compilers
+a 5-20% GPU kernel performance increase is expected.
+These benefits are seen with CUDA 7.5 (which is now the version we recommend);
+certain older versions (e.g. 7.0) see even larger improvements.
+
+Even larger improvements in OpenCL performance on AMD devices are
+expected, e.g. >50% with RF/plain cut-off and PME with potential shift
+with recent AMD OpenCL compilers.
+
+Note that due to limitations of the NVIDIA OpenCL compiler CUDA is still superior
+in performance on NVIDIA GPUs. Hence, it is recommended to use CUDA-based GPU acceleration
+on NVIDIA hardware.
+
+
+Improved support for OpenCL devices
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+The OpenCL support is now fully compatible with all intra- and
+inter-node parallelization modes, including MPI, thread-MPI, and GPU
+sharing by PP ranks. (The previous limitations were caused by bugs in high-level
+GROMACS code.)
+
+Additionally some prefetching in the short-ranged kernels (similar to
+that in the CUDA code) that had been disabled was found to be useful
+after all.
+
+Added Lennard-Jones combination-rule kernels for GPUs
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+Implemented LJ combination-rule parameter lookup in the CUDA and
+OpenCL kernels for both geometric and Lorentz-Berthelot combination
+rules, and enabled it for plain LJ cut-off. This optimization was
+already present in the CPU kernels. This improves performance with
+e.g. OPLS, GROMOS and AMBER force fields by about 10-15% (but does not
+help with CHARMM force fields because they use force-switched kernels).
+
+Added support for CUDA CC 6.0/6.1
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+Added build-system and kernel-generator support for the Pascal
+architectures announced so far (GP100: 6.0, GP104: 6.1) and supported
+by the CUDA 8.0 compiler.
+
+By default we now generate binary as well as PTX code for both sm_60 and
+sm_61 and given the considerable differences between the two, we also
+generate PTX for both virtual architectures. For now we don't add CC 6.2 (GP102)
+compilation support as we know nothing about it.
+
+On the kernel-generation side, given the increased register file, for
+CC 6.0 the "wider" 128 threads/block kernels are enabled, on 6.1 and
+later the 64 threads/block remains.
+
+Improved GPU pair-list splitting to improve performance
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+Instead of splitting the GPU lists (to generate more work units) based
+on a maximum cut-off, we now generate lists as close to the target
+list size as possible. The heuristic estimate for the number of
+cluster pairs is now too high by 0-1% instead of 10%. This results in
+a few percent fewer pair lists, but still slightly more than
+requested.
+
+Improved CUDA GPU memory configuration
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+This makes use of the larger amount of L1 cache
+available for global load caching on hardware that supports it (K40,
+K80, Tegra K1, & CC 5.2) by passing the appropriate command line
+option ("-dlcm=ca").
+
+:issue:`1804`
+
+Automatic nstlist changes were tuned for Intel Knights Landing
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+
+CPU improvements
+^^^^^^^^^^^^^^^^
+
+These improvements to individual kernels will provide incremental
+improvements to CPU performance for simulations where they are active,
+but their value for simulations using GPU offload is much higher,
+because via the auto-tuning, they permit all kinds of resource
+utilization and throughput to increase.
+
+Optimized the bonded thread force reduction
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+The code for multi-threading of bonded interactions has to combine the
+forces afterwards. This reduction now uses fixed-size blocks of 32
+atoms, and instead of dividing reduction of the whole range of blocks
+uniformly over the threads, now only used blocks are divided
+(uniformly) over the threads.  This speeds up the reduction by a
+factor of the number of threads (!) for typical protein+water systems
+when not using domain decomposition. With domain decomposition, the
+speed up is up to a factor of 3.
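The block-based division of work described above can be sketched as follows. The block size of 32 comes from the text; the function names and the contiguous, near-equal split are assumptions for illustration and do not reproduce the actual mdrun data structures.

```python
BLOCK_SIZE = 32  # fixed block size in atoms, as stated in the text


def used_blocks(atom_indices):
    """Return the sorted indices of blocks that contain at least one listed atom."""
    return sorted({i // BLOCK_SIZE for i in atom_indices})


def divide_blocks(blocks, num_threads):
    """Split the used blocks into contiguous, near-equal chunks, one per thread."""
    n = len(blocks)
    return [
        blocks[t * n // num_threads:(t + 1) * n // num_threads]
        for t in range(num_threads)
    ]
```

Because only blocks that actually received bonded forces are handed out, threads do not spend time reducing ranges that are empty, such as water atoms without multi-threaded bonded interactions.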
+
+Used SIMD transpose-scatter in bonded force reduction
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+The angle and dihedral SIMD functions now use the SIMD transpose
+scatter functions for force reduction. This change gives a massive
+performance improvement for bondeds, mainly because the dihedral
+force update did a lot of vector operations without SIMD that are
+now fully replaced by SIMD operations.
+
+Added SIMD implementation of Lennard-Jones 1-4 interactions
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+This gives a speed improvement of a few factors. The main improvement comes
+from simplified analytical LJ instead of tables; SIMD helps a bit.
+
+Added SIMD implementation of SETTLE
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+On Haswell CPUs, this makes SETTLE a factor of 5 faster.
+
+Added SIMD support for routines that do periodic boundary coordinate transformations
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+
+Threading improvements
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+These improvements enhance the performance of code that runs over
+multiple CPU threads.
+
+Improved Verlet-scheme pair-list workload balancing
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+
+Implemented near perfect load-balancing for Verlet-scheme CPU
+pair-lists. This increases the search cost by 3%, but this is
+outweighed by the more balanced non-bonded kernel times, particularly
+for small systems.
+
+Improved the threading of virtual-site code
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+On many threads, a significant part of the vsites would end up in
+the separate serial task, thereby limiting scaling. Now two weakly
+dependent tasks are generated for each thread and one of them uses
+a thread-local force buffer, parts of which are reduced by different
+threads that are responsible for those parts.
+
+Also the setup now runs multi-threaded.
+
+Add OpenMP support to more loops
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+Loops over the number of atoms cause a significant amount of serial time
+with a large number of threads, which limits scaling.
+
+Add OpenMP parallelization for the pull code
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+The pull code could take up to a third of the compute time for OpenMP
+parallel simulation with large pull groups.
+Now all pull-code loops over atoms have an OpenMP parallel version.
+
+Other improvements
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Multi-simulations are coupled less frequently
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+For example, replica-exchange simulations communicate between simulations
+only at exchange attempts. Plain multi-simulations do not communicate
+between simulations. Overall performance will tend to improve any time
+the progress of one simulation might be faster than others (e.g. it's
+at a different pressure, or using a quieter part of the network).
diff --git a/docs/release-notes/2016/major/removed-features.rst b/docs/release-notes/2016/major/removed-features.rst

new file mode 100644 (file)

index 0000000..940f60b
--- /dev/null
+++ b/docs/release-notes/2016/major/removed-features.rst
@@ -0,0 +1,53 @@
+Removed mdrun features
+^^^^^^^^^^^^^^^^^^^^^^
+
+Removed SD2 integrator
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+This integrator has known problems, and is in all ways inferior to
+sd. It has no tests, and was deprecated in GROMACS 5.0. There are no
+plans to replace it.
+
+:issue:`1137`
+
+Removed the twin-range scheme
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+Only the (deprecated) group scheme supports this, and the Verlet scheme will not
+support it in the foreseeable future.  There is now the explicit
+requirement that rlist >= max(rcoulomb,rvdw).
+
+Removed support for twin-range with VV integrators
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+Group-scheme twin-ranged non-bonded interactions never worked with
+velocity-Verlet integrators and constraints. There are no plans to
+make that combination work.
+
+:issue:`1137`, :issue:`1793`
+
+Removed Reaction-Field-nec
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+The reaction-field no-exclusion correction option was only introduced for
+backward compatibility and a performance advantage for systems
+with only rigid molecules (e.g. water). For all other systems
+the forces are incorrect. The Verlet scheme does not support this
+option and, even if it did, it would not improve performance.
+
+Removed AdResS module
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+This feature requires the (deprecated) group scheme, and there are no
+plans to port it to the Verlet scheme.
+
+:issue:`1852`
+
+Removed mdrun -compact
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+It is too complicated to support multiple ways of analysing per-step
+data.
+
+Removed lambda printing from mdrun log file
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+:issue:`1773`
+
+Removed GMX_NOCHARGEGROUPS
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+This undocumented feature was only useful with the (deprecated) group
+scheme.
diff --git a/docs/release-notes/2016/major/tools.rst b/docs/release-notes/2016/major/tools.rst

new file mode 100644 (file)

index 0000000..685b14b
--- /dev/null
+++ b/docs/release-notes/2016/major/tools.rst
@@ -0,0 +1,88 @@
+Improvements to GROMACS tools
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Supported replacing solvent in ``gmx insert-molecules``
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+Make it possible to specify the solvent (or other set of atoms) with
+``-replace`` (as a selection) for ``gmx insert-molecules``, and make the tool
+replace residues from this set with the inserted molecules, instead of
+not inserting there. It is assumed that the solvent consists of
+single-residue molecules, since molecule information would require a tpr
+input, which might not be commonly available when preparing the system.
+
+Default random seeds have changed for some analysis tools
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+See the documentation of the individual tools for their functionality. In
+some cases, the magic value to obtain a generated seed has changed (or is
+now documented).
+
+Made ``gmx solvate`` and ``gmx insert-molecules`` work better with PDB inputs
+"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+When both ``-f`` and ``-o`` were .pdb files, the pdbinfo struct got
+out-of-sync when the atoms were added/removed.
+
+:issue:`1887`
+
+Tools in the new analysis framework can read trajectory files with subsets
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+Make tools written for the new C++ analysis framework support analyzing
+trajectories that contain an arbitrary subset of atoms.
+
+:issue:`1861`
+
+Made moleculetype name case sensitive
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+This is useful in case you have more than 36 chains in your system
+with chain IDs set. PDB allows using both uppercase letters, lowercase
+letters and numbers for chain identifiers. Now we can use the maximum
+of 62 chains.
+
+Added number density normalization option for ``gmx rdf``
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+Add an option to ``gmx rdf`` that allows selecting a radial number density
+as the normalization for the output (in addition to current raw
+neighbor counts and the actual RDF).
+
+Simplified ``gmx genconf`` by removing ``-block``, ``-sort`` and ``-shuffle``
+"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+Option ``-block`` isn't useful since particle decomposition was removed.
+Options ``-sort`` and ``-shuffle`` were undocumented and don't seem very
+useful - these days they would be somebody's simple python script.
+
+Used macros for units and conversions in ``gmx wham``
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+Also :issue:`1841`
+
+Improved ``gmx sasa`` error message
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+Print more information when an output group is not part of the group
+selected for calculation, which should help the user diagnose the issue.
+
+Made ``gmx vanhove`` work without PBC
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+
+Fix ``gmx hbond`` group overlap check
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+``gmx hbond`` does not support partially overlapping analysis groups.
+The check in the code was broken and never caught this, resulting in
+incorrect output that might look OK at first sight.
+Also corrected bitmasks = enums that (intentionally?) seemed to give
+correct results by not using non power of 2 enum index entries.
+
+Made ``gmx dos`` work again.
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+Due to an error in the index handling ``gmx dos`` always stopped with a fatal
+error.
+
+:issue:`1996`
+
+Add checks for too much memory in ``gmx nmeig``
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+``gmx nmeig`` could request storage for eigenvector output and matrices
+for more than ``INT_MAX`` elements, but nearly all loop variables are int.
+Now a fatal error is produced in this case. This also avoids the
+confusing error message when too much memory is requested; the allocation
+routine will get the correct size, but gmx_fatal prints it as a smaller
+integer.
+Added support for ``-first`` > 1 with sparse matrices.
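The size check described above can be illustrated as follows. This is a sketch only: the function name is hypothetical, a 32-bit ``int`` is assumed, and Python integers do not overflow, so the limit is checked explicitly rather than observed.

```python
INT_MAX = 2**31 - 1  # assumes a 32-bit int, as on common platforms


def check_element_count(rows, cols):
    """Return rows * cols, refusing sizes that an int loop index cannot cover.

    Hypothetical helper; raises instead of calling a fatal-error routine.
    """
    count = rows * cols
    if count > INT_MAX:
        raise ValueError(
            f"{count} elements requested, more than INT_MAX ({INT_MAX})"
        )
    return count
```

Failing early with the true element count avoids reporting a misleading, truncated size later on.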
+
diff --git a/docs/release-notes/2018/2018.1.rst b/docs/release-notes/2018/2018.1.rst

index 8f3aa9dc2c6c12cff4d499334a7bf4a0cfb85ca3..fc2ba289e02cfee299dabe283d7f2f8834e1607c 100644 (file)
--- a/docs/release-notes/2018/2018.1.rst
+++ b/docs/release-notes/2018/2018.1.rst
@@ -1,41 +1,44 @@
  GROMACS 2018.1 release notes
-============================
+----------------------------
  
-This version was released on FIX ME WHEN RELEASING. These release
-notes document the changes that have taken place in GROMACS since the
+This version was released on February 23, 2018. These release notes
+document the changes that have taken place in GROMACS since the
  initial version 2018, to fix known issues. It also incorporates all
-fixes made in version 2016.5.
+fixes made in version 2016.5 and earlier, which you can find described
+in the :ref:`release-notes`.
+
+Fixes where mdrun could behave incorrectly
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  
  Used SIMD bondeds without perturbed interactions
---------------------------------------------------------------------------
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
  In free-energy calculations that lacked bonded interactions between
  perturbed atom types, the SIMD-accelerated bonded functions were
  inadvertently disabled. This has been enabled, which will improve
  the performance of some kinds of free-energy calculations.
  
  Fixed bonds whose displacement was zero
---------------------------------------------------------------------------
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
  We should allow overlapping atoms in harmonic bonds. But the former
  code would cause a floating point exception and incorrect free-energy
  derivatives.
  
  Fixed centre-of-mass motion removal on part of the system
---------------------------------------------------------------------------
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
  COMM removal requested for part of the system acted on the whole
  system.
  
  :issue:`2381`
  
-Added check in grompp to avoid assertion failure
---------------------------------------------------------------------------
-With an mdp file with a parameter present with both the current name
-and the old name which automatically gets replaced, an assertion
-would fail. Now a fatal error is issued.
+Fixed handling of mdp ``define`` statement assigning preprocessor values
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+Now .mdp files can configure the topology with values, as originally
+intended, e.g. ``"define = -DBOOL -DVAR=VALUE"``.
  
-:issue:`2386`
+:issue:`2392`
  
  Prevented log file energy average printing dividing by zero
---------------------------------------------------------------------------
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
  If very few simulation frames have computed energies, then there may
  be insufficient data for averages. If so, skip the average printing
  entirely.
@@ -43,7 +46,7 @@ entirely.
  :issue:`2394`
  
  Correctly set cutoff modifiers in forcerec
---------------------------------------------------------------------------
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
  The cutoff modifiers were not copied from interaction_const_t
  to forcerec_t which meant only the generic kernels were used with
  the group scheme. This fix will restore the performance of the
@@ -52,36 +55,53 @@ group scheme.
  :issue:`2399`
  
  Fixed box scaling in PME mixed mode using both GPU and CPU
---------------------------------------------------------------------------
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
  
  :issue:`2385`
  
  Re-enabled GPU support with walls and 1 energy group
---------------------------------------------------------------------------
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
  With a single non-bonded energy group and walls, we can now use a GPU
  for non-bonded calculations.
  
  Removed tumbling ice-cube warning with SD integrator
---------------------------------------------------------------------------
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
  With SD, there is friction, so ice cubes will not tumble.
  
  Fixed assertion failure in test-particle insertion
---------------------------------------------------------------------------
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
  Erroneous logic in the TPI meant that it always failed without producing
  any result.
  
  :issue:`2398`
  
  Avoided mdrun echoing "No option -multi"
---------------------------------------------------------------------------
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
  mdrun would print as many messages "No option -multi" as there
  are MPI ranks to stderr.
  Also updated -multi to -multidir in an error message.
  
  :issue:`2377`
  
+Fixes for ``gmx`` tools
+^^^^^^^^^^^^^^^^^^^^^^^
+
+Added check in grompp to avoid assertion failure
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+With an mdp file with a parameter present with both the current name
+and the old name which automatically gets replaced, an assertion
+would fail. Now a fatal error is issued.
+
+:issue:`2386`
+
+Fixes to improve portability
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+GoogleTest death tests are now used in a more portable way
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+
  Used more portable python shebangs
---------------------------------------------------------------------------
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
  Per https://www.python.org/dev/peps/pep-0394/#recommendation, we
  should use env, and point it at python2. When we either make them 2/3
  or just-3 compatible, this should change.
@@ -92,12 +112,12 @@ we should choose to be explicit, and thus somewhat portable.
  :issue:`2401`
  
  Added work-around for GCC 5.3 targetting AVX512 hardware
---------------------------------------------------------------------------
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
  GCC 5.3 has bug in overload resolution causing the AVX512
  and scalar function to become ambiguous.
  
  Used isfinite unambiguously
---------------------------------------------------------------------------
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
  Patch provided by Veselin Kolev to quiet some compiler warnings.
  
  :issue:`2400`
diff --git a/docs/release-notes/index.rst b/docs/release-notes/index.rst

index f637d8619610ffcfeb98717ba7a29f9304f801ae..87bca1e09c2c0899ddb7a2dfdb56d742c387af11 100644 (file)
--- a/docs/release-notes/index.rst
+++ b/docs/release-notes/index.rst
@@ -1,3 +1,5 @@
+.. _release-notes:
+
  Release notes
  =============
  
@@ -9,15 +11,17 @@ issues identified in the corresponding major releases.
  Two versions of |Gromacs| are under active maintenance, the 2018
  series and the 2016 series. In the latter, only highly conservative
  fixes will be made, and only to address issues that affect scientific
-correctness. Such fixes will also be incorporated into the 2018
-release series, as appropriate. Around the time the 2019 release is
-made, the 2016 series will no longer be maintained.
+correctness. Naturally, some of those releases will be made after the
+year 2016 ends, but we keep 2016 in the name so users understand how
+up to date their version is. Such fixes will also be incorporated into
+the 2018 release series, as appropriate. Around the time the 2019
+release is made, the 2016 series will no longer be maintained.
  
  Where issue numbers are reported in these release notes, more details
  can be found at https://redmine.gromacs.org at that issue number.
  
-|Gromacs| 2018 series release notes
------------------------------------
+|Gromacs| 2018 series
+---------------------
  
  Patch releases
  ^^^^^^^^^^^^^^
@@ -42,7 +46,40 @@ Major release
     2018/major/portability
     2018/major/miscellaneous
  
-|Gromacs| 2016 series release notes
------------------------------------
+|Gromacs| 2016 series
+---------------------
+
+Patch releases
+^^^^^^^^^^^^^^
+
+.. toctree::
+   :maxdepth: 1
+
+   2016/2016.5
+   2016/2016.4
+   2016/2016.3
+   2016/2016.2
+   2016/2016.1
+
+
+Major release
+^^^^^^^^^^^^^
+
+.. toctree::
+   :maxdepth: 1
+
+   2016/major/highlights
+   2016/major/new-features
+   2016/major/performance
+   2016/major/tools
+   2016/major/bugs-fixed
+   2016/major/removed-features
+   2016/major/miscellaneous
+
+Older (unmaintained) |Gromacs| series
+-------------------------------------------------------
+
+.. toctree::
+   :maxdepth: 1
  
-Coming soon!
+   older/index
diff --git a/docs/release-notes/older/index.rst b/docs/release-notes/older/index.rst

new file mode 100644 (file)

index 0000000..cc5ae56
--- /dev/null
+++ b/docs/release-notes/older/index.rst
@@ -0,0 +1,19 @@
+.. _older-release-notes:
+
+Release notes for older |Gromacs| versions
+==========================================
+
+Unfortunately, resources are finite and many versions of |Gromacs| are
+no longer actively maintained. This page records the release notes for
+all such versions, so that users can find a record of the changes made
+in all major and patch releases of |Gromacs|. Major releases contain
+changes to the functionality supported, whereas patch releases contain
+only fixes for issues identified in the corresponding major releases.
+
+Where issue numbers are reported in these release notes, more details
+can be found at https://redmine.gromacs.org at that issue number.
+
+|Gromacs| 5.1 series
+-----------------------------------
+
+TODO coming soon
diff --git a/src/gromacs/fileio/readinp.cpp b/src/gromacs/fileio/readinp.cpp

index 64b8be9f54d5d9ad6ebaafbe22a8f51a867856bb..2f4bb4ad50e6a783bf9bfa0f1e43d01ccbf60825 100644 (file)
--- a/src/gromacs/fileio/readinp.cpp
+++ b/src/gromacs/fileio/readinp.cpp
@@ -95,12 +95,15 @@ t_inpfile *read_inpfile(gmx::TextInputStream *stream, const char *fn, int *ninp,
          }
          if (tokens.size() > 2)
          {
-            // TODO ignoring such lines does not seem like good behaviour
-            if (debug)
-            {
-                fprintf(debug, "Multiple equals signs on line %d in file %s, ignored\n", indexOfLineReadFromFile, fn);
-            }
-            continue;
+            // More than one equals symbol in the original line is
+            // valid if the RHS is a free string, and needed for
+            // "define = -DBOOLVAR -DVAR=VALUE".
+            //
+            // First, drop all the fields on the RHS of the first equals symbol.
+            tokens.resize(1);
+            // This find cannot return std::string::npos.
+            auto firstEqualsPos = line.find('=');
+            tokens.emplace_back(gmx::stripString(line.substr(firstEqualsPos + 1)));
          }
          if (tokens[0].empty())
          {
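Note on the readinp.cpp hunk above: the new code keeps everything after the first equals sign as one right-hand-side string, so "define = -DBOOL -DVAR=VALUE" is no longer dropped. A minimal standalone sketch of that idea follows. It assumes plain std::string handling; trim() and splitAtFirstEquals() are hypothetical names used only for illustration and are not the GROMACS functions (the patch itself uses gmx::stripString on an already tokenized line).

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical helper, not part of GROMACS: trims ASCII whitespace
// from both ends of a string.
static std::string trim(const std::string &s)
{
    const char *whitespace = " \t\r\n";
    const auto  first      = s.find_first_not_of(whitespace);
    if (first == std::string::npos)
    {
        return "";
    }
    const auto last = s.find_last_not_of(whitespace);
    return s.substr(first, last - first + 1);
}

// Hypothetical helper: splits "name = value" at the FIRST '=' only, so
// any later '=' characters stay inside the value. The caller must pass
// a line that contains at least one '='.
static std::pair<std::string, std::string> splitAtFirstEquals(const std::string &line)
{
    const auto pos = line.find('=');
    assert(pos != std::string::npos && "line must contain an equals sign");
    return { trim(line.substr(0, pos)), trim(line.substr(pos + 1)) };
}
```

Under these assumptions, calling splitAtFirstEquals on the mdp line from the new test yields the name "define" and the value "-DBOOL -DVAR=VALUE", with the second equals sign preserved.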
diff --git a/src/gromacs/gmxpreprocess/tests/readir.cpp b/src/gromacs/gmxpreprocess/tests/readir.cpp

index 4b72470dec36f67b769cde621b5124d91d018a37..6c07d2eab4ea9ae010d55de1195806bb1d15a015 100644 (file)
--- a/src/gromacs/gmxpreprocess/tests/readir.cpp
+++ b/src/gromacs/gmxpreprocess/tests/readir.cpp
@@ -157,6 +157,14 @@ TEST_F(GetIrTest, UserErrorsSilentlyTolerated)
      runTest(joinStrings(inputMdpFile, "\n"));
  }
  
+TEST_F(GetIrTest, DefineHandlesAssignmentOnRhs)
+{
+    const char *inputMdpFile[] = {
+        "define = -DBOOL -DVAR=VALUE",
+    };
+    runTest(joinStrings(inputMdpFile, "\n"));
+}
+
  TEST_F(GetIrTest, EmptyInputWorks)
  {
      const char *inputMdpFile = "";
diff --git a/src/gromacs/gmxpreprocess/tests/refdata/GetIrTest_DefineHandlesAssignmentOnRhs.xml b/src/gromacs/gmxpreprocess/tests/refdata/GetIrTest_DefineHandlesAssignmentOnRhs.xml

new file mode 100644 (file)

index 0000000..95038d7
--- /dev/null
+++ b/src/gromacs/gmxpreprocess/tests/refdata/GetIrTest_DefineHandlesAssignmentOnRhs.xml
@@ -0,0 +1,321 @@
+<?xml version="1.0"?>
+<?xml-stylesheet type="text/xsl" href="referencedata.xsl"?>
+<ReferenceData>
+  <Bool Name="Error parsing mdp file">false</Bool>
+  <String Name="OutputMdpFile">
+; VARIOUS PREPROCESSING OPTIONS
+; Preprocessor information: use cpp syntax.
+; e.g.: -I/home/joe/doe -I/home/mary/roe
+include                  = 
+; e.g.: -DPOSRES -DFLEXIBLE (note these variable names are case sensitive)
+define                   = -DBOOL -DVAR=VALUE
+
+; RUN CONTROL PARAMETERS
+integrator               = md
+; Start time and timestep in ps
+tinit                    = 0
+dt                       = 0.001
+nsteps                   = 0
+; For exact run continuation or redoing part of a run
+init-step                = 0
+; Part index is updated automatically on checkpointing (keeps files separate)
+simulation-part          = 1
+; mode for center of mass motion removal
+comm-mode                = Linear
+; number of steps for center of mass motion removal
+nstcomm                  = 100
+; group(s) for center of mass motion removal
+comm-grps                = 
+
+; LANGEVIN DYNAMICS OPTIONS
+; Friction coefficient (amu/ps) and random seed
+bd-fric                  = 0
+ld-seed                  = -1
+
+; ENERGY MINIMIZATION OPTIONS
+; Force tolerance and initial step-size
+emtol                    = 10
+emstep                   = 0.01
+; Max number of iterations in relax-shells
+niter                    = 20
+; Step size (ps^2) for minimization of flexible constraints
+fcstep                   = 0
+; Frequency of steepest descents steps when doing CG
+nstcgsteep               = 1000
+nbfgscorr                = 10
+
+; TEST PARTICLE INSERTION OPTIONS
+rtpi                     = 0.05
+
+; OUTPUT CONTROL OPTIONS
+; Output frequency for coords (x), velocities (v) and forces (f)
+nstxout                  = 0
+nstvout                  = 0
+nstfout                  = 0
+; Output frequency for energies to log file and energy file
+nstlog                   = 1000
+nstcalcenergy            = 100
+nstenergy                = 1000
+; Output frequency and precision for .xtc file
+nstxout-compressed       = 0
+compressed-x-precision   = 1000
+; This selects the subset of atoms for the compressed
+; trajectory file. You can select multiple groups. By
+; default, all atoms will be written.
+compressed-x-grps        = 
+; Selection of energy groups
+energygrps               = 
+
+; NEIGHBORSEARCHING PARAMETERS
+; cut-off scheme (Verlet: particle based cut-offs, group: using charge groups)
+cutoff-scheme            = Verlet
+; nblist update frequency
+nstlist                  = 10
+; ns algorithm (simple or grid)
+ns-type                  = Grid
+; Periodic boundary conditions: xyz, no, xy
+pbc                      = xyz
+periodic-molecules       = no
+; Allowed energy error due to the Verlet buffer in kJ/mol/ps per atom,
+; a value of -1 means: use rlist
+verlet-buffer-tolerance  = 0.005
+; nblist cut-off        
+rlist                    = 1
+; long-range cut-off for switched potentials
+
+; OPTIONS FOR ELECTROSTATICS AND VDW
+; Method for doing electrostatics
+coulombtype              = Cut-off
+coulomb-modifier         = Potential-shift-Verlet
+rcoulomb-switch          = 0
+rcoulomb                 = 1
+; Relative dielectric constant for the medium and the reaction field
+epsilon-r                = 1
+epsilon-rf               = 0
+; Method for doing Van der Waals
+vdw-type                 = Cut-off
+vdw-modifier             = Potential-shift-Verlet
+; cut-off lengths       
+rvdw-switch              = 0
+rvdw                     = 1
+; Apply long range dispersion corrections for Energy and Pressure
+DispCorr                 = No
+; Extension of the potential lookup tables beyond the cut-off
+table-extension          = 1
+; Separate tables between energy group pairs
+energygrp-table          = 
+; Spacing for the PME/PPPM FFT grid
+fourierspacing           = 0.12
+; FFT grid size, when a value is 0 fourierspacing will be used
+fourier-nx               = 0
+fourier-ny               = 0
+fourier-nz               = 0
+; EWALD/PME/PPPM parameters
+pme-order                = 4
+ewald-rtol               = 1e-05
+ewald-rtol-lj            = 0.001
+lj-pme-comb-rule         = Geometric
+ewald-geometry           = 3d
+epsilon-surface          = 0
+implicit-solvent         = no
+
+; OPTIONS FOR WEAK COUPLING ALGORITHMS
+; Temperature coupling  
+tcoupl                   = No
+nsttcouple               = -1
+nh-chain-length          = 10
+print-nose-hoover-chain-variables = no
+; Groups to couple separately
+tc-grps                  = 
+; Time constant (ps) and reference temperature (K)
+tau-t                    = 
+ref-t                    = 
+; pressure coupling     
+pcoupl                   = No
+pcoupltype               = Isotropic
+nstpcouple               = -1
+; Time constant (ps), compressibility (1/bar) and reference P (bar)
+tau-p                    = 1
+compressibility          = 
+ref-p                    = 
+; Scaling of reference coordinates, No, All or COM
+refcoord-scaling         = No
+
+; OPTIONS FOR QMMM calculations
+QMMM                     = no
+; Groups treated Quantum Mechanically
+QMMM-grps                = 
+; QM method             
+QMmethod                 = 
+; QMMM scheme           
+QMMMscheme               = normal
+; QM basisset           
+QMbasis                  = 
+; QM charge             
+QMcharge                 = 
+; QM multiplicity       
+QMmult                   = 
+; Surface Hopping       
+SH                       = 
+; CAS space options     
+CASorbitals              = 
+CASelectrons             = 
+SAon                     = 
+SAoff                    = 
+SAsteps                  = 
+; Scale factor for MM charges
+MMChargeScaleFactor      = 1
+
+; SIMULATED ANNEALING  
+; Type of annealing for each temperature group (no/single/periodic)
+annealing                = 
+; Number of time points to use for specifying annealing in each group
+annealing-npoints        = 
+; List of times at the annealing points for each group
+annealing-time           = 
+; Temp. at each annealing point, for each group.
+annealing-temp           = 
+
+; GENERATE VELOCITIES FOR STARTUP RUN
+gen-vel                  = no
+gen-temp                 = 300
+gen-seed                 = -1
+
+; OPTIONS FOR BONDS    
+constraints              = none
+; Type of constraint algorithm
+constraint-algorithm     = Lincs
+; Do not constrain the start configuration
+continuation             = no
+; Use successive overrelaxation to reduce the number of shake iterations
+Shake-SOR                = no
+; Relative tolerance of shake
+shake-tol                = 0.0001
+; Highest order in the expansion of the constraint coupling matrix
+lincs-order              = 4
+; Number of iterations in the final step of LINCS. 1 is fine for
+; normal simulations, but use 2 to conserve energy in NVE runs.
+; For energy minimization with constraints it should be 4 to 8.
+lincs-iter               = 1
+; Lincs will write a warning to the stderr if in one step a bond
+; rotates over more degrees than
+lincs-warnangle          = 30
+; Convert harmonic bonds to morse potentials
+morse                    = no
+
+; ENERGY GROUP EXCLUSIONS
+; Pairs of energy groups for which all non-bonded interactions are excluded
+energygrp-excl           = 
+
+; WALLS                
+; Number of walls, type, atom types, densities and box-z scale factor for Ewald
+nwall                    = 0
+wall-type                = 9-3
+wall-r-linpot            = -1
+wall-atomtype            = 
+wall-density             = 
+wall-ewald-zfac          = 3
+
+; COM PULLING          
+pull                     = no
+
+; AWH biasing          
+awh                      = no
+
+; ENFORCED ROTATION    
+; Enforced rotation: No or Yes
+rotation                 = no
+
+; Group to display and/or manipulate in interactive MD session
+IMD-group                = 
+
+; NMR refinement stuff 
+; Distance restraints type: No, Simple or Ensemble
+disre                    = No
+; Force weighting of pairs in one distance restraint: Conservative or Equal
+disre-weighting          = Conservative
+; Use sqrt of the time averaged times the instantaneous violation
+disre-mixed              = no
+disre-fc                 = 1000
+disre-tau                = 0
+; Output frequency for pair distances to energy file
+nstdisreout              = 100
+; Orientation restraints: No or Yes
+orire                    = no
+; Orientation restraints force constant and tau for time averaging
+orire-fc                 = 0
+orire-tau                = 0
+orire-fitgrp             = 
+; Output frequency for trace(SD) and S to energy file
+nstorireout              = 100
+
+; Free energy variables
+free-energy              = no
+couple-moltype           = 
+couple-lambda0           = vdw-q
+couple-lambda1           = vdw-q
+couple-intramol          = no
+init-lambda              = -1
+init-lambda-state        = -1
+delta-lambda             = 0
+nstdhdl                  = 50
+fep-lambdas              = 
+mass-lambdas             = 
+coul-lambdas             = 
+vdw-lambdas              = 
+bonded-lambdas           = 
+restraint-lambdas        = 
+temperature-lambdas      = 
+calc-lambda-neighbors    = 1
+init-lambda-weights      = 
+dhdl-print-energy        = no
+sc-alpha                 = 0
+sc-power                 = 1
+sc-r-power               = 6
+sc-sigma                 = 0.3
+sc-coul                  = no
+separate-dhdl-file       = yes
+dhdl-derivatives         = yes
+dh_hist_size             = 0
+dh_hist_spacing          = 0.1
+
+; Non-equilibrium MD stuff
+acc-grps                 = 
+accelerate               = 
+freezegrps               = 
+freezedim                = 
+cos-acceleration         = 0
+deform                   = 
+
+; simulated tempering variables
+simulated-tempering      = no
+simulated-tempering-scaling = geometric
+sim-temp-low             = 300
+sim-temp-high            = 300
+
+; Ion/water position swapping for computational electrophysiology setups
+; Swap positions along direction: no, X, Y, Z
+swapcoords               = no
+adress                   = no
+
+; User defined thingies
+user1-grps               = 
+user2-grps               = 
+userint1                 = 0
+userint2                 = 0
+userint3                 = 0
+userint4                 = 0
+userreal1                = 0
+userreal2                = 0
+userreal3                = 0
+userreal4                = 0
+; Electric fields
+; Format for electric-field-x, etc. is: four real variables:
+; amplitude (V/nm), frequency omega (1/ps), time for the pulse peak (ps),
+; and sigma (ps) width of the pulse. Omega = 0 means static field,
+; sigma = 0 means no pulse, leaving the field to be a cosine function.
+electric-field-x         = 0 0 0 0
+electric-field-y         = 0 0 0 0
+electric-field-z         = 0 0 0 0
+</String>
+</ReferenceData>
diff --git a/src/gromacs/gpu_utils/gpu_utils.cu b/src/gromacs/gpu_utils/gpu_utils.cu

index 2df4709ab4755440375242777b30326fe470cbaa..7660d7cc249315dd5b2571a89c01ab862ae106c0 100644 (file)
--- a/src/gromacs/gpu_utils/gpu_utils.cu
+++ b/src/gromacs/gpu_utils/gpu_utils.cu
@@ -603,6 +603,11 @@ static bool is_gmx_supported_gpu(const cudaDeviceProp *dev_prop)
   *  errors: incompatibility, insistence, or insanity (=unexpected behavior).
   *  It also returns the respective device's properties in \dev_prop (if applicable).
   *
+ *  As the error handling only permits returning the state of the GPU, this function
+ *  does not clear the CUDA runtime API status allowing the caller to inspect the error
+ *  upon return. Note that this also means it is the caller's responsibility to
+ *  reset the CUDA runtime state.
+ *
   *  \param[in]  dev_id   the ID of the GPU to check.
   *  \param[out] dev_prop the CUDA device properties of the device checked.
   *  \returns             the status of the requested device
@@ -718,6 +723,9 @@ void findGpus(gmx_gpu_info_t *gpu_info)
                                       "canDetectGpus() was not called appropriately beforehand."));
      }
  
+    // We expect to start device support/sanity checks with a clean runtime error state
+    gmx::ensureNoPendingCudaError("");
+
      snew(devs, ndev);
      for (i = 0; i < ndev; i++)
      {
@@ -731,8 +739,27 @@ void findGpus(gmx_gpu_info_t *gpu_info)
          {
              gpu_info->n_dev_compatible++;
          }
+        else
+        {
+            // TODO:
+            //  - we inspect the CUDA API state to retrieve and record any
+            //    errors that occurred during is_gmx_supported_gpu_id() here,
+            //    but this would be more elegant done within is_gmx_supported_gpu_id()
+            //    and only return a string with the error if one was encountered.
+            //  - we'll be reporting without rank information which is not ideal.
+            //  - we'll end up warning also in cases where users would already
+            //    get an error before mdrun aborts.
+            //
+            // Here we also clear the CUDA API error state so potential
+            // errors during sanity checks don't propagate.
+            if ((stat = cudaGetLastError()) != cudaSuccess)
+            {
+                gmx_warning(gmx::formatString("An error occurred while sanity checking device #%d; %s: %s",
+                                              devs[i].id, cudaGetErrorName(stat), cudaGetErrorString(stat)).c_str());
+            }
+        }
      }
-    GMX_RELEASE_ASSERT(cudaSuccess == cudaPeekAtLastError(), "Should be cudaSuccess");
+    GMX_RELEASE_ASSERT(cudaSuccess == cudaPeekAtLastError(), "We promise to return with clean CUDA state!");
  
      gpu_info->n_dev   = ndev;
      gpu_info->gpu_dev = devs;
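Note on the gpu_utils.cu hunks above: the fix relies on the difference between reading the runtime's last error without clearing it (cudaPeekAtLastError) and reading it while resetting it to success (cudaGetLastError). The sketch below is a toy model of that pattern in plain C++ so it can run without a GPU. LastErrorSlot, Status and checkDevices are hypothetical names invented for illustration; none of this is the CUDA runtime API or GROMACS code.

```cpp
#include <cassert>

// Toy stand-in for a runtime's "last error" slot; NOT the CUDA API.
enum class Status { Success, Failure };

class LastErrorSlot
{
public:
    // Remember a failure until somebody takes it.
    void record(Status s)
    {
        if (s != Status::Success)
        {
            last_ = s;
        }
    }
    // Read without clearing (models "peek at last error").
    Status peek() const { return last_; }
    // Read and reset to Success (models "get last error").
    Status take()
    {
        const Status s = last_;
        last_          = Status::Success;
        return s;
    }

private:
    Status last_ = Status::Success;
};

// Hypothetical driver: runs a per-device check, counts failures, and
// clears the slot after each failed check so the error cannot leak
// into later, unrelated calls.
template<typename Check>
int checkDevices(LastErrorSlot &slot, int numDevices, Check check)
{
    int failures = 0;
    for (int i = 0; i < numDevices; i++)
    {
        if (!check(i, slot))
        {
            ++failures;
            slot.take();
        }
    }
    // Mirrors the promise in the patch: return with a clean error state.
    assert(slot.peek() == Status::Success);
    return failures;
}
```

The design point this is meant to show is that the caller, not the per-device check, owns the cleanup, which matches the responsibility described in the new doxygen comment.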
diff --git a/src/gromacs/gpu_utils/gpu_utils.h b/src/gromacs/gpu_utils/gpu_utils.h

index c364363ed6efe173ec0ef518e192c61906a98ecd..9634c380343e97ea0b0531dbcdf9c70627f78bb0 100644 (file)
--- a/src/gromacs/gpu_utils/gpu_utils.h
+++ b/src/gromacs/gpu_utils/gpu_utils.h
@@ -96,6 +96,11 @@ bool canDetectGpus(std::string *GPU_FUNC_ARGUMENT(errorMessage)) GPU_FUNC_TERM_W
   *  gpu_info->gpu_dev array with the required information on each the
   *  device: ID, device properties, status.
   *
+ *  Note that this function leaves the GPU runtime API error state clean;
+ *  this is implemented ATM in the CUDA flavor.
+ *  TODO: check if errors do propagate in OpenCL as they do in CUDA and
+ *  whether there is a mechanism to "clear" them.
+ *
   *  \param[in] gpu_info    pointer to structure holding GPU information.
   *
   *  \throws                InternalError if a GPU API returns an unexpected failure (because
diff --git a/src/gromacs/gpu_utils/tests/devicetransfers.cu b/src/gromacs/gpu_utils/tests/devicetransfers.cu

index e2d6a0d2bad543767c380f2642d06d6f5a392ab8..88a497c19ee3dd54c558875863941b7bb801f088 100644 (file)
--- a/src/gromacs/gpu_utils/tests/devicetransfers.cu
+++ b/src/gromacs/gpu_utils/tests/devicetransfers.cu
@@ -1,7 +1,7 @@
  /*
   * This file is part of the GROMACS molecular simulation package.
   *
- * Copyright (c) 2017, by the GROMACS development team, led by
+ * Copyright (c) 2017,2018, by the GROMACS development team, led by
   * Mark Abraham, David van der Spoel, Berk Hess, and Erik Lindahl,
   * and including many others, as listed in the AUTHORS file in the
   * top-level source directory and at http://www.gromacs.org.
@@ -49,6 +49,7 @@
  #include "devicetransfers.h"
  
  #include "gromacs/gpu_utils/cudautils.cuh"
+#include "gromacs/gpu_utils/gpu_utils.h"
  #include "gromacs/hardware/gpu_hw_info.h"
  #include "gromacs/utility/arrayref.h"
  #include "gromacs/utility/exceptions.h"
@@ -79,20 +80,21 @@ void doDeviceTransfers(const gmx_gpu_info_t &gpuInfo,
                         ArrayRef<char>        output)
  {
      GMX_RELEASE_ASSERT(input.size() == output.size(), "Input and output must have matching size");
-    if (gpuInfo.n_dev == 0)
+    const auto compatibleGpus = getCompatibleGpus(gpuInfo);
+    if (compatibleGpus.empty())
      {
          std::copy(input.begin(), input.end(), output.begin());
          return;
      }
      cudaError_t status;
  
-    const auto &device = gpuInfo.gpu_dev[0];
+    const auto *device = getDeviceInfo(gpuInfo, compatibleGpus[0]);
      int         oldDeviceId;
  
      status = cudaGetDevice(&oldDeviceId);
      throwUponFailure(status, "getting old device id");
-    status = cudaSetDevice(device.id);
-    throwUponFailure(status, "setting device id to 0");
+    status = cudaSetDevice(device->id);
+    throwUponFailure(status, "setting device id to the first compatible GPU");
  
      void       *devicePointer;
      status = cudaMalloc(&devicePointer, input.size());
diff --git a/src/gromacs/gpu_utils/tests/devicetransfers_ocl.cpp b/src/gromacs/gpu_utils/tests/devicetransfers_ocl.cpp

index ace54963b862992dc990d1e42f9e5b1aa7d1a853..7c0215ef26ee92479f1a66b90309dbce9e8e0949 100644 (file)
--- a/src/gromacs/gpu_utils/tests/devicetransfers_ocl.cpp
+++ b/src/gromacs/gpu_utils/tests/devicetransfers_ocl.cpp
@@ -1,7 +1,7 @@
  /*
   * This file is part of the GROMACS molecular simulation package.
   *
- * Copyright (c) 2017, by the GROMACS development team, led by
+ * Copyright (c) 2017,2018, by the GROMACS development team, led by
   * Mark Abraham, David van der Spoel, Berk Hess, and Erik Lindahl,
   * and including many others, as listed in the AUTHORS file in the
   * top-level source directory and at http://www.gromacs.org.
@@ -41,6 +41,7 @@
  #include "gmxpre.h"
  
  #include "gromacs/gpu_utils/gmxopencl.h"
+#include "gromacs/gpu_utils/gpu_utils.h"
  #include "gromacs/gpu_utils/oclutils.h"
  #include "gromacs/hardware/gpu_hw_info.h"
  #include "gromacs/utility/arrayref.h"
@@ -74,22 +75,23 @@ void doDeviceTransfers(const gmx_gpu_info_t &gpuInfo,
                         ArrayRef<char>        output)
  {
      GMX_RELEASE_ASSERT(input.size() == output.size(), "Input and output must have matching size");
-    if (gpuInfo.n_dev == 0)
+    const auto compatibleGpus = getCompatibleGpus(gpuInfo);
+    if (compatibleGpus.empty())
      {
          std::copy(input.begin(), input.end(), output.begin());
          return;
      }
      cl_int                status;
  
-    const auto           &device       = gpuInfo.gpu_dev[0];
+    const auto           *device       = getDeviceInfo(gpuInfo, compatibleGpus[0]);
      cl_context_properties properties[] = {
          CL_CONTEXT_PLATFORM,
-        (cl_context_properties) device.ocl_gpu_id.ocl_platform_id,
+        (cl_context_properties) device->ocl_gpu_id.ocl_platform_id,
          0
      };
      // Give uncrustify more space
  
-    auto deviceId = device.ocl_gpu_id.ocl_device_id;
+    auto deviceId = device->ocl_gpu_id.ocl_device_id;
      auto context  = clCreateContext(properties, 1, &deviceId, NULL, NULL, &status);
      throwUponFailure(status, "creating context");
      auto commandQueue = clCreateCommandQueue(context, deviceId, 0, &status);
diff --git a/src/gromacs/mdlib/update.cpp b/src/gromacs/mdlib/update.cpp

index 1cab82b2695e1726920a1e8ebdcbe6787d0eca3c..60896c1045bf14c35777ed6ccd978290bc47ae47 100644 (file)
--- a/src/gromacs/mdlib/update.cpp
+++ b/src/gromacs/mdlib/update.cpp
@@ -549,23 +549,34 @@ static void do_update_md(int                         start,
  
      if (doNoseHoover || doPROffDiagonal || doAcceleration)
      {
+        matrix stepM;
+        if (!doParrinelloRahman)
+        {
+            /* We should not apply PR scaling at this step */
+            clear_mat(stepM);
+        }
+        else
+        {
+            copy_mat(M, stepM);
+        }
+
          if (!doAcceleration)
          {
              updateMDLeapfrogGeneral<AccelerationType::none>
                  (start, nrend, doNoseHoover, dt, dtPressureCouple,
-                ir, md, ekind, box, x, xprime, v, f, nh_vxi, M);
+                ir, md, ekind, box, x, xprime, v, f, nh_vxi, stepM);
          }
          else if (ekind->bNEMD)
          {
              updateMDLeapfrogGeneral<AccelerationType::group>
                  (start, nrend, doNoseHoover, dt, dtPressureCouple,
-                ir, md, ekind, box, x, xprime, v, f, nh_vxi, M);
+                ir, md, ekind, box, x, xprime, v, f, nh_vxi, stepM);
          }
          else
          {
              updateMDLeapfrogGeneral<AccelerationType::cosine>
                  (start, nrend, doNoseHoover, dt, dtPressureCouple,
-                ir, md, ekind, box, x, xprime, v, f, nh_vxi, M);
+                ir, md, ekind, box, x, xprime, v, f, nh_vxi, stepM);
          }
      }
      else
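Note on the update.cpp hunk above: the leap-frog update now receives a per-step matrix that is zero on steps where Parrinello-Rahman scaling should not be applied, and a copy of M otherwise. A minimal sketch of that selection follows. It assumes a std::array based 3x3 type in place of the GROMACS matrix type, and scalingMatrixForStep is a hypothetical name used only here.

```cpp
#include <array>

// Assumption: a 3x3 matrix of doubles standing in for the GROMACS "matrix" type.
using Matrix3 = std::array<std::array<double, 3>, 3>;

// Hypothetical helper: returns the matrix the integrator should use
// on this step.
Matrix3 scalingMatrixForStep(const Matrix3 &M, bool doParrinelloRahman)
{
    if (!doParrinelloRahman)
    {
        // Value-initialized, so every element is zero: no PR scaling
        // is applied at this step.
        return Matrix3{};
    }
    return M;
}
```

Passing the result to the update routine instead of M itself keeps the routine unchanged while making the off-step behavior explicit.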
diff --git a/src/programs/mdrun/runner.cpp b/src/programs/mdrun/runner.cpp

index c8d8b54c4c9eda1825f182f441e61f364640e10c..67824e3258e4705a9d4c2312dab72175a0124469 100644 (file)
--- a/src/programs/mdrun/runner.cpp
+++ b/src/programs/mdrun/runner.cpp
@@ -1041,7 +1041,14 @@ int Mdrunner::mdrunner()
      }
      if (isMultiSim(ms))
      {
-        MPI_Barrier(ms->mpi_comm_masters);
+        if (MASTER(cr))
+        {
+            MPI_Barrier(ms->mpi_comm_masters);
+        }
+        /* We need another barrier to prevent non-master ranks from continuing
+         * when an error occurred in a different simulation.
+         */
+        MPI_Barrier(cr->mpi_comm_mysim);
      }
  #endif
admin/builds/pre-submit-matrix.txt
docs/CMakeLists.txt
docs/release-notes/2016/2016.1.rst  [new file with mode: 0644]
docs/release-notes/2016/2016.2.rst  [new file with mode: 0644]
docs/release-notes/2016/2016.3.rst  [new file with mode: 0644]
docs/release-notes/2016/2016.4.rst  [new file with mode: 0644]
docs/release-notes/2016/2016.5.rst  [new file with mode: 0644]
docs/release-notes/2016/major/bugs-fixed.rst  [new file with mode: 0644]
docs/release-notes/2016/major/highlights.rst  [new file with mode: 0644]
docs/release-notes/2016/major/miscellaneous.rst  [new file with mode: 0644]
docs/release-notes/2016/major/new-features.rst  [new file with mode: 0644]
docs/release-notes/2016/major/performance.rst  [new file with mode: 0644]
docs/release-notes/2016/major/removed-features.rst  [new file with mode: 0644]
docs/release-notes/2016/major/tools.rst  [new file with mode: 0644]
docs/release-notes/2018/2018.1.rst
docs/release-notes/index.rst
docs/release-notes/older/index.rst  [new file with mode: 0644]
src/gromacs/fileio/readinp.cpp
src/gromacs/gmxpreprocess/tests/readir.cpp
src/gromacs/gmxpreprocess/tests/refdata/GetIrTest_DefineHandlesAssignmentOnRhs.xml  [new file with mode: 0644]
src/gromacs/gpu_utils/gpu_utils.cu
src/gromacs/gpu_utils/gpu_utils.h
src/gromacs/gpu_utils/tests/devicetransfers.cu
src/gromacs/gpu_utils/tests/devicetransfers_ocl.cpp
src/gromacs/mdlib/update.cpp
src/programs/mdrun/runner.cpp