Erik Lindahl [Wed, 8 Jul 2015 14:59:06 +0000 (16:59 +0200)]
Move remaining C files in utility to C++
Added spaces between strings and literals. The string handling
code should eventually be replaced with proper C++ versions using
std::string, so for this reason I have not bothered improving them.
Change-Id: I78b871113f4a3891139a412c7831a1e5209bfaec
Roland Schulz [Sat, 11 Jul 2015 02:23:47 +0000 (22:23 -0400)]
Update kernel templates to match kernels
Change-Id: I20f41dc9b3e9da670e0c940697fa35d2473ab33b
Erik Lindahl [Tue, 7 Jul 2015 09:45:31 +0000 (11:45 +0200)]
Use cmakedefine01 for more trivial defines
This patch does the move for GMX_CXX11, HAVE_EXTRAE,
GMX_NO_RENAME, all the QMMM defines, GMX_X11, HAVE_POSIX_REGEX,
HAVE_CXX11_REGEX, HAVE_SIGUSR1, GMX_COOL_QUOTES, HAVE_PIPES,
HAVE_FEENABLEEXCEPT and HAVE_ZLIB.
Change-Id: I2c915c608d9fd16d584b605373c1151dcf3cc010
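The practical effect of cmakedefine01 on client code can be sketched as follows. With this scheme the generated config header always defines each macro, as either 0 or 1, so source files test the value with #if instead of testing for existence with #ifdef. The two #define lines below are hand-written stand-ins for generated output, and the chosen values, describeBuild() and haveX11() are assumptions made only for illustration.

```cpp
#include <cassert>
#include <string>

// Stand-ins for what CMake would generate from "#cmakedefine01 GMX_X11"
// and "#cmakedefine01 HAVE_ZLIB". The values are arbitrary examples.
#define GMX_X11 0
#define HAVE_ZLIB 1

// With 0/1 defines the correct test is "#if NAME". "#ifdef GMX_X11" would
// be true here even though the feature is switched off.
std::string describeBuild()
{
    std::string result;
#if GMX_X11
    result += "x11 ";
#endif
#if HAVE_ZLIB
    result += "zlib ";
#endif
    return result;
}

// Because the macro always has a value, it can also be used in ordinary
// C++ expressions instead of preprocessor blocks.
constexpr bool haveX11()
{
    return GMX_X11 != 0;
}
```

A side benefit of this style is that a misspelled macro name in an #if can be caught with compiler warnings such as -Wundef, whereas a misspelled #ifdef silently evaluates to false.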
Erik Lindahl [Wed, 8 Jul 2015 23:47:28 +0000 (01:47 +0200)]
Moved tools files to C++
Added extern-c declarations to gmxcpp.h for calling
from the new C++ source.
Change-Id: I7b21398808a1d58849bff44690ece4d1f4e9db04
Teemu Murtola [Thu, 9 Jul 2015 16:52:54 +0000 (19:52 +0300)]
More gmxfio simplification
- Remove gmx_fio_setdebug() and all implementation related to it;
getting the source code location printed to stderr is not really
useful for errors that typically originate from the file system.
- Move gmx_fio_setprecision() to gmxfio-xdr, since it logically belongs
there.
- Remove bOpen, as the only case where any code could observe
bOpen==FALSE requires a use-after-free, and possibly also an ugly
race condition on top of that.
- Reduce use of iFTP member within gmxfio, where the purpose is to check
whether the fio->xdr member is not NULL.
- Remove checks from gmx_fio_open() that overlapped with those in
gmx_ffopen(). There are some corner cases that now work differently:
gmxfio no longer bypasses the check for .Z/.gz files for xdr files,
but lets things through to gmx_ffopen(), and with GMX_FAHCORE there is
no longer any kind of existence check for xdr files (it is
questionable whether the previous behavior was intentionally different
for xdr and non-xdr files, anyways).
- Remove dead code: fio->fp is always non-NULL if fio->xdr is.
Change-Id: Icd1588bfbb510543a55f57fb86e0e1e2a1fce236
Erik Lindahl [Thu, 9 Jul 2015 10:56:19 +0000 (12:56 +0200)]
Merge "Merge release-5-1 into master"
Roland Schulz [Thu, 9 Jul 2015 02:55:44 +0000 (22:55 -0400)]
New quotes
Change-Id: I57deba0f9f4b4cc5a2f5dc58e3d2c4fe4284691b
Roland Schulz [Wed, 8 Jul 2015 17:34:31 +0000 (13:34 -0400)]
Merge release-5-1 into master
Conflicts:
cmake/gmxVersionInfo.cmake
Change-Id: Iab02dc56743ef0c79e0a0c3f9e7cee7dc18442aa
Roland Schulz [Wed, 8 Jul 2015 07:35:13 +0000 (03:35 -0400)]
Merge release-5-0 into release-5-1
Conflicts:
src/gromacs/gmxana/gmx_dos.c
Change-Id: I06208bcafb880b9bac3f45050d709fae5eab9a9d
Erik Lindahl [Wed, 8 Jul 2015 10:57:35 +0000 (12:57 +0200)]
Move generalized born to C++
Change-Id: Ie4498eed9923b7c4df8589cc1abc75d92155abdb
Erik Lindahl [Mon, 6 Jul 2015 12:56:23 +0000 (14:56 +0200)]
Move FFT defines to use cmakedefine01
Change-Id: Ie9fface5ea2ec365aa740cfe60243d195e1039ba
Berk Hess [Fri, 27 Feb 2015 09:19:31 +0000 (10:19 +0100)]
Separated bonded and Ewald correction threading
The bonded and Ewald correction threading code used the same data
structures. These have now been separated and the bonded threading
data is opaque and defined in listed-internal.h.
This change is only code refactoring, except for the removal of
the thread local ewald correction force array and reduction, which
were never needed.
Change-Id: I162255107aa9e153a4c8ff41d786867d6fbed444
Erik Lindahl [Mon, 6 Jul 2015 13:19:40 +0000 (15:19 +0200)]
Move endian, float format and XDR settings to cmakedefine01
Also modify gmxtree.py to understand the new protocol
Change-Id: I68f1bfff9528d71d4a683082aaa4d637f8b10a64
Erik Lindahl [Tue, 7 Jul 2015 15:02:36 +0000 (17:02 +0200)]
Cleanup and remove unused SSE2 generalized born code
This code has been disabled for quite a while due to
a bug. Since we should anyway move to verlet-style
kernels there is no point in keeping these files around.
Change-Id: Idfd65ac2d0d9f304d548c97e4dbabbaf72df7a7b
Roland Schulz [Wed, 8 Jul 2015 07:28:27 +0000 (03:28 -0400)]
Merge release-4-6 into release-5-0
Change-Id: I25fea1226adfaa332c5c7b0630e99031266178f4
Erik Lindahl [Tue, 7 Jul 2015 20:02:24 +0000 (22:02 +0200)]
Move old neighborsearch code to C++
This will eventually be removed in favor of the new
verlet code, but for now it will make life simpler to
have everything moved to C++.
Change-Id: I3242e5f345769a42ffb99d1953bc1bef39324ff5
Erik Lindahl [Tue, 7 Jul 2015 20:50:41 +0000 (22:50 +0200)]
Move QMMM source to C++
Tested by enabling the four different QM/MM interfaces,
but not linking to the actual programs since those calls
are still unaltered.
Change-Id: I86163a0db56003687e6226a78169141af98940b7
Teemu Murtola [Sun, 5 Jul 2015 04:12:43 +0000 (07:12 +0300)]
Wrap uses of thread_mpi/mutex.h
Introduce gmx::Mutex and gmx::lock_guard to hide the actual C++ mutex
implementation used, and make all code that was using tMPI::Mutex use
these.
For now, the implementation is just directly imported from thread-MPI,
but this allows changing the implementation (e.g., to C++11 std::mutex
or to a TBB mutex) when thread-MPI is no longer appropriate, with mainly
changing this single header. Also, this makes it clearer to introduce
new C++ wrappers for things like tMPI_Lock_t, as they can now be written
in this wrapper layer, instead of in thread-MPI where they will
introduce additional such dependencies.
Change-Id: Ie4c91d4e74c5dbc5e4d2b5d7fb5a76b73ef5616b
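The wrapper idea described above can be sketched roughly as below. This is not the GROMACS header: the namespace, the class layout and the Counter client are hypothetical, and std::mutex is used as the backing implementation only because the commit message mentions it as one possible future choice.

```cpp
#include <cassert>
#include <mutex>

namespace sketch
{

// Hypothetical wrapper: client code only names sketch::Mutex, so the
// backing type (std::mutex here, as an assumption) can be replaced by
// editing this one class.
class Mutex
{
    public:
        void lock() { impl_.lock(); }
        void unlock() { impl_.unlock(); }

    private:
        std::mutex impl_;
};

// RAII guard that locks on construction and unlocks on destruction.
class lock_guard
{
    public:
        explicit lock_guard(Mutex &mutex) : mutex_(mutex) { mutex_.lock(); }
        ~lock_guard() { mutex_.unlock(); }

        lock_guard(const lock_guard &)            = delete;
        lock_guard &operator=(const lock_guard &) = delete;

    private:
        Mutex &mutex_;
};

// Example client that depends only on the wrapper types.
class Counter
{
    public:
        int increment()
        {
            lock_guard guard(mutex_);
            return ++value_;
        }

    private:
        Mutex mutex_;
        int   value_ = 0;
};

} // namespace sketch
```

Client code such as Counter never mentions the underlying mutex type, which is what makes a later change of implementation a single-header edit.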
Mark Abraham [Mon, 29 Jun 2015 22:49:57 +0000 (00:49 +0200)]
First 5.1 release candidate
Change-Id: I89776d6dd4d170d4f259fa8ee761cf4f6878cbe0
Berk Hess [Wed, 1 Jul 2015 13:46:40 +0000 (15:46 +0200)]
Fix too small GPU pair count estimates
For triclinic unit-cells with DD the non-local cluster pair count
estimate was too high, especially for thin local domains, due to an
incorrect estimate of the cluster size. Since the pair count estimate
for the local pair-list was determined as a total minus a non-local
estimate, the local estimate could get negative and cause exceptions.
Fixed the cluster size estimate and added a lower limit for the local
size estimate.
Fixes #1762.
Change-Id: I3489550968f66bc03ba4e6056017a58eba37f7cc
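The second half of the fix (the lower limit) amounts to clamping a difference of two estimates. A minimal sketch, assuming a hypothetical helper name and a caller-supplied limit; the actual cluster size estimate and the value of the limit used in GROMACS are not shown in this log:

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper: the local estimate is the total minus the non-local
// estimate. If the non-local estimate is too high the difference can become
// negative, so it is clamped to a caller-supplied lower limit.
double localPairCountEstimate(double totalEstimate,
                              double nonLocalEstimate,
                              double lowerLimit)
{
    assert(lowerLimit >= 0);
    return std::max(totalEstimate - nonLocalEstimate, lowerLimit);
}
```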
Berk Hess [Sat, 4 Jul 2015 08:18:05 +0000 (10:18 +0200)]
Fix bug in GPU list balancing
The function split_sci_entry could produce empty lists. This seems
not to have caused incorrect results, only slight extra processing
of empty workunits in the CUDA kernel. Incorrect Coulomb energies
could appear for empty lists with shift=CENTRAL, but that does not
seem to happen.
Refs #1767.
Change-Id: I0b0ff0a450734d4863f1e9636ff5741d4f1a68da
Teemu Murtola [Mon, 29 Jun 2015 10:43:11 +0000 (13:43 +0300)]
Improve gmx help -export Sphinx rebuild behavior
Make 'gmx help -export rst' only touch those files whose contents have
actually changed. This means that rerunning Sphinx is much faster when,
e.g., checking the help text for a single tool after fixes.
Use the file output redirection mechanism to capture all output into
intermediate buffers, and only write it to a file if that file is
missing or has incorrect contents.
Update the reference data to avoid future confusion, as the files are
now written out in a different order (but the tests pass irrespective of
the order of the stuff in the reference data).
Change-Id: I41d764e16aa68a5aaa7da879f4b600e268ca70f2
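The "only touch files whose contents changed" behavior can be illustrated with a small standalone sketch. The function names are hypothetical and plain iostreams are used here; the actual change goes through the GROMACS file output redirection mechanism, which is not reproduced.

```cpp
#include <cassert>
#include <fstream>
#include <sstream>
#include <string>

// Reads the whole file into *contents. Returns false if it cannot be opened.
bool readFile(const std::string &path, std::string *contents)
{
    std::ifstream in(path, std::ios::binary);
    if (!in)
    {
        return false;
    }
    std::ostringstream buffer;
    buffer << in.rdbuf();
    *contents = buffer.str();
    return true;
}

// Writes newContents to path only if the file is missing or differs.
// Returns true if the file was (re)written, false if it was left untouched,
// so that its timestamp does not change and dependent tools can skip it.
bool writeIfChanged(const std::string &path, const std::string &newContents)
{
    std::string oldContents;
    if (readFile(path, &oldContents) && oldContents == newContents)
    {
        return false;
    }
    std::ofstream out(path, std::ios::binary | std::ios::trunc);
    out << newContents;
    return true;
}
```

The output is first collected in memory (here simply the newContents string), which is what makes the comparison possible before anything is written.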
Berk Hess [Thu, 2 Jul 2015 14:03:34 +0000 (16:03 +0200)]
Fix DD DLB state issue
The introduction of DLB locking for PME load balancing added another
DLB state, which was stored in a third variable. These variables
were not always all properly checked. Simplified the code by merging
these three state variables into one. In addition, there was a fourth
variable (bGridJump) in gmx_domdec_t; this is replaced by calls to
a function returning whether DLB is on.
Refs #1760.
Change-Id: I80d499149e4e5bfd689e76208384a8ba61e2842a
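Merging several flags into one state variable, with a query function in place of a cached boolean, might look like the following sketch. All names here (DlbState and its enumerators, DomDecSketch, isDlbOn, canTurnDlbOn) are hypothetical and do not reproduce the actual GROMACS definitions.

```cpp
#include <cassert>

// Hypothetical single state variable replacing several separate flags.
enum class DlbState
{
    OffForever,     // DLB can never be turned on in this run
    OffCanTurnOn,   // DLB is off, but may be turned on automatically
    OffLocked,      // DLB is off and locked, e.g. during PME load balancing
    On              // DLB is active
};

struct DomDecSketch
{
    DlbState dlbState = DlbState::OffCanTurnOn;
};

// Replaces a separately stored boolean: the answer is always derived from
// the single state, so it cannot get out of sync with it.
bool isDlbOn(const DomDecSketch &dd)
{
    return dd.dlbState == DlbState::On;
}

// DLB may only be switched on from the one state that allows it.
bool canTurnDlbOn(const DomDecSketch &dd)
{
    return dd.dlbState == DlbState::OffCanTurnOn;
}
```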
Berk Hess [Fri, 3 Jul 2015 20:28:52 +0000 (22:28 +0200)]
Fix bug in GPU list balancing
The function split_sci_entry could produce empty lists, which can
cause illegal memory access or incorrect energies. Before commit
6106367b this bug was never triggered, since nsp_max was never smaller
than a full cj4 entry. But 6106367b introduced a bug that could
produce negative nsp_max.
Fixes #1767.
Change-Id: I2007cf6851f94f4f2ca62f609a0628725014dbe7
Teemu Murtola [Mon, 29 Jun 2015 10:02:11 +0000 (13:02 +0300)]
Fix error wrapping in interactive selection input
Resolve a TODO that was introduced when making the selection prompting
use text streams about wrapping error messages like they used to be
wrapped.
The use of console width within the generic error formatting code is a
bit ugly, but that needs to be resolved separately (and the old code
also had the console width hardcoded in a similar level).
Change-Id: Ia832a4884eef9bded50431fb5008d8191f52c246
Teemu Murtola [Sat, 27 Jun 2015 17:39:33 +0000 (20:39 +0300)]
Tests for interactive selection input
Add InteractiveTestHelper to test interactive sessions that use streams
from textstream.h. Use this to test the behavior of interactive
selection input. Errors in the input or the help are not tested (yet),
and some corner cases may be missing, but most of the code is now
covered. Separate set of tests is needed for SelectionOptionManager
prompts (will be added separately).
Remove obsolete special case handling from the interactive parser
(verified to not change any of the now tested behavior).
Change-Id: I448d470bcc240659a380ffa2d3b492949420c64b
Teemu Murtola [Sat, 4 Jul 2015 17:45:11 +0000 (20:45 +0300)]
Fix copy-paste bug in gmx distance
The -oxyz option did not behave properly (the computed values should be
fine, but the behavior of where the output goes can be unpredictable).
Change-Id: Idcd389c3809189f85a630094b9aaea6d61a5f954
Teemu Murtola [Mon, 29 Jun 2015 04:58:04 +0000 (07:58 +0300)]
Refactor for testing interactive selection input
Introduce a parseInteractive() method that works otherwise as
parseFromStdin(), except that the caller can provide alternative
streams for input and status output. Make all input and output use
these streams instead of stdin or stderr directly. Unit tests using
this will be added separately.
There is a slight regression in the formatting of errors during
interactive selection input (noted in TODO in exceptions.cpp), but that
will be resolved separately to keep the commits smaller.
Change-Id: I0b6b885621c81a5e526b5f93b40d32b9248626f1
Teemu Murtola [Fri, 26 Jun 2015 04:16:48 +0000 (07:16 +0300)]
Remove gmx::File (except for File::exists())
Move remaining static methods to either path.h, or to the file stream
classes.
Change-Id: I53e910051b9ef57e501adad4cec1a4b5295d24a7
Teemu Murtola [Fri, 26 Jun 2015 03:52:03 +0000 (06:52 +0300)]
Replace direct uses of gmx::File with streams
Replace all cases that used gmx::File with a stream-based
implementation. Only static methods in gmx::File are called from
outside file.cpp (will be removed separately).
This is only direct replacement to remove uses of gmx::File; some
additional refactoring is necessary to support alternative streams for
unit testing.
Change-Id: I5e5905f9f8956909f3396359b18e9660f100bcaa
Teemu Murtola [Sat, 4 Jul 2015 03:48:44 +0000 (06:48 +0300)]
Bump various version numbers
It is unclear how the next version will be numbered, but bumped the
version numbers nonetheless to make the distinction between master and
release-5-1 clear.
Change-Id: I1cfe9f2a2523056f20cc825025425fc9429c89b6
Teemu Murtola [Sun, 28 Jun 2015 19:21:52 +0000 (22:21 +0300)]
Make some unit tests use mock file output
Introduce TextOutputStream interface that can be used instead of direct
file output. Make unit tests for help output use an in-memory stream
instead of a real file using this mechanism.
TextWriter wraps the raw stream and provides more convenient
line-oriented writing capabilities (and provides a natural place to
implement more of the same, making them immediately available
independent of the stream used).
Most of the touched files only contain mechanical
File -> TextWriter/TextOutputStream
and related documentation replacements. The main non-trivial changes
are introduction of the streams and TextWriter in utility/, and
reorganization of the way the tests use TestOutputRedirector.
Further work follows to allow full removal of the old File class and to
extend the use of the streams.
Change-Id: I7700d0d8af5f44b304b940797a4834993e4738fb
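The layering described above (a raw stream interface, an in-memory implementation for tests, and a writer that adds line-oriented convenience) can be sketched as follows. The class and method names carry a "Sketch" suffix because they are illustrative assumptions, not the signatures of the GROMACS classes.

```cpp
#include <cassert>
#include <string>

// Hypothetical minimal stream interface: anything that accepts text.
class TextOutputStreamSketch
{
    public:
        virtual ~TextOutputStreamSketch() {}
        virtual void write(const std::string &text) = 0;
};

// In-memory implementation that a unit test can inspect afterwards.
class StringOutputStreamSketch : public TextOutputStreamSketch
{
    public:
        void write(const std::string &text) override { buffer_ += text; }
        const std::string &contents() const { return buffer_; }

    private:
        std::string buffer_;
};

// Writer that wraps any stream and adds line-oriented helpers, so the
// helpers are available independent of the stream type.
class TextWriterSketch
{
    public:
        explicit TextWriterSketch(TextOutputStreamSketch *stream)
            : stream_(stream)
        {
            assert(stream != nullptr);
        }

        void writeLine(const std::string &line)
        {
            stream_->write(line);
            stream_->write("\n");
        }

    private:
        TextOutputStreamSketch *stream_;
};

// Example of code under test: it only sees the stream interface.
void writeHelp(TextOutputStreamSketch *out)
{
    TextWriterSketch writer(out);
    writer.writeLine("SYNOPSIS");
    writer.writeLine("gmx tool [options]");
}
```

A test passes a StringOutputStreamSketch to writeHelp() and compares contents() against the expected text, without creating any file.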
Berk Hess [Thu, 2 Jul 2015 08:09:43 +0000 (10:09 +0200)]
Obey OpenMP thread count limit with tMPI
With thread-MPI mdrun would choose the number of OpenMP threads so
that the maximal number of hardware threads was used. When the number
of ranks was limited by the system size, this led to too high OpenMP
thread counts which lowered the efficiency. Now a limit is imposed.
Also updated some comments and renamed constants and bNTOptSet.
Change-Id: I830b5a3f2fd28f87acfbcf982103b62fc3e45758
Berk Hess [Wed, 1 Jul 2015 13:04:13 +0000 (15:04 +0200)]
Fix two PME DLB trigger issues
Dynamic load balancing got triggered while locked by PME load
balancing, because a check was placed incorrectly.
PME load balancing would never trigger with separate PME ranks
because a comparison was inverted.
Fixes #1760.
Fixes #1763.
Change-Id: I75eeb32423b864f84bfd45ecb61d169b473ed74a
Mark Abraham [Thu, 2 Jul 2015 14:57:10 +0000 (16:57 +0200)]
Fix ThreadMPI GPU assumptions
The OpenCL implementation introduces the constraint of one GPU per
node, but thread-MPI still assumed any compatible GPU was available
for use and thus should have a rank.
Consolidated the configure-time constants behind some API functions so
that we can use the same behaviour in the various setup code.
Added a warning message that the OpenCL implementation has to waste a
GPU, stopped showing another warning message related to wasting
GPUs when the OpenCL implementation forces this, and improved
another message to clarify why gmx mdrun -ntmpi 2 won't work
with OpenCL.
Also fixed a few references to thread-MPI threads that are better
called thread-MPI ranks.
Change-Id: I4664c49786ebd26a53cbf5e1c26df79649ba4f5f
Berk Hess [Thu, 2 Jul 2015 12:05:26 +0000 (14:05 +0200)]
Correct grompp pull warning message
A warning for pull-coord?-groups referred to pull-coord?-geometry
instead. This is fixed by changing the order of processing the pull
options, which better reflects the dependencies. Also reordered
the options in the mdp manual.
Change-Id: I6309d021282156cd3409af35bcfa38dc2cab1c67
Teemu Murtola [Thu, 2 Jul 2015 18:22:33 +0000 (21:22 +0300)]
Convert gmxfio to C++
- Remove unused variables.
- Add missing extern "C" in thread-MPI.
Change-Id: I595f60ee27ab851e8c7c304b1f02a8f4aa15932f
Szilard Pall [Wed, 1 Jul 2015 14:49:09 +0000 (16:49 +0200)]
Fix OpenCL compilation errors.
Fixes a typo in a structure. Also fixes an incorrect
variable name only visible on OS X.
Fixes #1765
Change-Id: I0ee0f61da1f036163aa85f719ef9ceb0dab06868
Teemu Murtola [Wed, 1 Jul 2015 03:21:41 +0000 (06:21 +0300)]
Remove status messages about Sphinx detection
Make FindSphinx.cmake and FindPythonModule.cmake respect the QUIET
option, and pass that to find_package() to not print out information on
every CMake run. Most people will not care whether these are found or
not, and being silent in all cases is the same approach as is used for
Doxygen.
In master, it could be useful to change at least some of the documentation
build rules such that they require GMX_DEVELOPER_BUILD to be set, and
that could also enable messages about not finding the components needed
for the documentation build, but that is outside the scope of this
change.
Fixes part of #1761 and #1764.
Change-Id: I196f5e66c94fe4247ae28bd230a469acbaad939a
Teemu Murtola [Thu, 2 Jul 2015 04:15:41 +0000 (07:15 +0300)]
Clean up gmxfio includes
- Move xdr writing routines to gmxfio-xdr.h, to make it match the source
file that implements them. Include the new header where necessary.
- Remove gmx_fio_checktype() and replace with asserts in the xdr
routines.
- Remove unnecessary gmxfio.h includes. In particular, remove it from
headers to not get it transitively to unrelated source files.
- Fix compilation that was broken by relying on transitively included
headers (mostly cstringutil.h and futil.h) that no longer come through
the transitive gmxfio.h.
- Some minor cleanup elsewhere, in particular hiding functions from
headers and fixing a few uses of mismatching file open/close
functions.
Change-Id: Ic5366a23a421cfec82a518caee772e2bb53e8303
Teemu Murtola [Tue, 30 Jun 2015 10:51:10 +0000 (13:51 +0300)]
Reorder code within gmxfio
Move all XDR writing code, including helper functions only used by it,
into gmxfio_xdr.c and rename it into gmxfio-xdr.c. Remove these from
gmxfio-impl.h and make them static. No changes to the actual code,
except for some handling of the itmp variables in gmx_bool functions to
satisfy gcc maybe-uninitialized warnings.
Change-Id: If2190f9429585ac0e2ee86b24dd41969f4ae8d49
Teemu Murtola [Tue, 30 Jun 2015 10:39:40 +0000 (13:39 +0300)]
Further gmxfio cleanup related to xdr handling
- Remove unnecessary iotype redirection, since only XDR is used for
this. This removes a lot of repetitive code.
- Do not duplicate the list of XDR file types from filenm.c.
- Do not use string comparison for figuring out whether a file should be
opened as text or binary.
Change-Id: I4522ce36b828283f02c56bb7920292019ae154fa
Teemu Murtola [Wed, 1 Jul 2015 03:36:49 +0000 (06:36 +0300)]
Minor include sorting changes
Make the include sorter sort files based on the basename of the file,
i.e., not including the extension into the string sort. This fixes an
unintuitive behavior that causes changes in the include order if a
'-' <-> '_' replacement is done for a file name (because '-' and '_' are
on different sides of '.' in ASCII).
Change-Id: Icd4bf58b0d60178e33f6840f24adb1e4108fb92a
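The ordering problem can be reproduced in a few lines. In ASCII, '-' is 0x2D, '.' is 0x2E and '_' is 0x5F, so when whole file names are compared, "gmxfio-xdr.h" sorts before "gmxfio.h" but "gmxfio_xdr.h" sorts after it. Comparing with the extension stripped removes the dependence on '.'. The sketch below is in C++ for illustration only; the function names are hypothetical and it does not reproduce the actual include sorter script.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Returns the name without its last extension ("a-b.h" -> "a-b").
std::string stripExtension(const std::string &name)
{
    const std::string::size_type dot = name.rfind('.');
    return dot == std::string::npos ? name : name.substr(0, dot);
}

// Old behavior: plain string sort on the full file name.
std::vector<std::string> sortByFullName(std::vector<std::string> names)
{
    std::sort(names.begin(), names.end());
    return names;
}

// New behavior: sort on the name without the extension.
std::vector<std::string> sortByBasename(std::vector<std::string> names)
{
    std::sort(names.begin(), names.end(),
              [](const std::string &a, const std::string &b)
              { return stripExtension(a) < stripExtension(b); });
    return names;
}
```

With the basename sort, "gmxfio.h" comes first in both spellings because "gmxfio" is a prefix of "gmxfio-xdr" and of "gmxfio_xdr", so a '-' <-> '_' rename no longer reorders the includes.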
Teemu Murtola [Tue, 30 Jun 2015 04:09:58 +0000 (07:09 +0300)]
Clean up unused code from gmxfio
- gmx_fio_do_*() are only used for writing xdr-based formats, so remove
other implementations (has been unused since support for .tpa and .tpb
formats was removed). Further decoupling should be possible, but
left for later to keep this change smaller.
- gmx_fio_read/write_*() were not used with the exception of
gmx_fio_write_string(), so removed. gmx_fio_do_*() should be
sufficient.
- Remove USE_XDR, as it is not feasible to turn off xdr support from
these few files alone. There are several file formats (most notably,
run input files and checkpoints) that can only be written in xdr, and
functionality is severely limited without those.
- Remove unused (and mistyped) HAVE_XML #ifdefs. It's not feasible to
inject XML support at this level...
- Move confusingly placed check for not opening .tng files with gmxfio.
- Removed unnecessary checks for debug files (debug files are no longer
opened with gmx_fio_fopen()).
- Remove a few other fields, enum members, and methods that had no effect.
- Rename gmxfio_int.h to gmxfio-impl.h to make its role clearer and to
make it follow the current naming conventions.
- Remove unnecessary #includes.
Change-Id: Ida39a076969da116c9a5f68d7bfd0355c2d2363b
Berk Hess [Tue, 30 Jun 2015 19:22:01 +0000 (21:22 +0200)]
Fix inconsistent OpenMP automation
The thread MPI single rank max thread count for non-Intel was 6,
this was smaller than the max allowed MPI+OpenMP thread count of 8,
which caused setups to be generated that did not pass the check.
Increased 6 to 8 and added an assertion.
Change-Id: I13787616d7c667cba3245da4f5b5c3a1a6a1206d
Mark Abraham [Mon, 20 Apr 2015 22:13:17 +0000 (00:13 +0200)]
Document how to add and use NVML support
Change-Id: I8ca7c5d1b163a78559a048ca6cc5b099f34c6cd6
Mark Abraham [Mon, 29 Jun 2015 22:32:52 +0000 (00:32 +0200)]
Avoid GPU data race also with OpenCL
Implements the same change to non-local stream synchronization as now
used for CUDA.
Fixes #1756
Change-Id: I720edc0951f97dcff0bd477084fff45a149f01d9
Berk Hess [Tue, 5 Aug 2014 10:07:36 +0000 (12:07 +0200)]
enabled 1 PP + 1 PME node
Change-Id: I18a4c2bac71f1b5b81d9d374b212bfb9edc7a1e8
Berk Hess [Fri, 24 Apr 2015 13:27:27 +0000 (15:27 +0200)]
Add checks for inefficient resource usage
Checks have been added for using too many OpenMP threads and, when
using GPUs, for using a single OpenMP thread. A fatal error is generated
in cases where we are quite sure performance is very sub-optimal. This
is nasty, but a fatal error is the only way to ensure that users don't
ignore this warning. The fatal error can be circumvented by explicitly
setting -ntomp; in that case a note is printed to log and stderr.
Now also avoids rank counts with thread-MPI that don't fit with the
total number of threads requested.
With a GPU and without DD, the thread count limit is now doubled.
Disabled GPU sharing with OpenCL.
Change-Id: Ib2d892dbac3d5716246fbfdb2e8f246cdc169787
Mark Abraham [Wed, 25 Mar 2015 13:41:40 +0000 (14:41 +0100)]
Add support for flushing WDDM queue
Relevant only with CUDA on Windows (and profiling?)
On Windows the WDDM driver (default for non-Tesla) can prevent
immediate submission of CUDA tasks to the GPU in an attempt
to try to amortize driver overheads. However, as we need
tasks to start immediately for optimal concurrent execution,
this "feature" will result in large overheads. A well-
documented workaround is implemented by this change.
Change-Id: I69a6bb59dc8cae18fba539de49c977c0ee814d07
Mark Abraham [Mon, 29 Jun 2015 20:25:11 +0000 (22:25 +0200)]
Merge "Merge branch release-5-0"
Erik Lindahl [Sat, 27 Jun 2015 12:16:53 +0000 (14:16 +0200)]
Fix OS X OpenCL builds
OS X does not like the quotes previously used to handle
OpenCL include paths with spaces - escape them instead.
With this change, OpenCL works at least on Yosemite
(OS X 10.10) using a GeForce GT 750M card, and passes
all Gromacs regression tests.
Change-Id: I2acd30256e2ff11ca1fde10361cc0cc55ee7fc05
Erik Lindahl [Thu, 25 Jun 2015 15:46:05 +0000 (11:46 -0400)]
Fix bugs in gmx dos
- Velocity autocorrelations were not normalized
by default, so they did not agree with gmx velacc.
- The normalize option had no effect on the VACs.
- The index group option was available, but no
index groups were processed.
- Since the DoS is calculated from the mass-weighted
VAC and by default only from the real part, it was
not clear why these results would differ from data
obtained with gmx velacc. There is at least a note
about this now, and more docs will be added in the
future.
- The hidden option to dump some plots has been
removed since it was not documented what these
contained (beyond a paper reference), and the
contents was not based on any data from the
trajectory, but rather plotting a custom function.
Fixes #1608.
Change-Id: Icfca060f94efb34bd7871bd90245ab0ddbbe91c1
Erik Lindahl [Thu, 25 Jun 2015 19:01:26 +0000 (15:01 -0400)]
Replace functions deprecated in OpenCL 1.2
Check for CL_VERSION_1_2 in the source, and
use newer versions in that case to avoid
warnings about deprecated functions.
Change-Id: I6f70e0178fa06c59be57168d94aae0fd7df148f5
Teemu Murtola [Mon, 29 Jun 2015 17:48:33 +0000 (20:48 +0300)]
Only accept exact matches for selection keywords
The selection parser tried to be nice to the user and also accept
unambiguous prefixes of keywords, but this also has a lot of side
effects that can be confusing (e.g., it was impossible to create
variables that had names that were prefixes to keywords or to other
variable names). Additionally, this was the only case where user input
could cause an exception during tokenization of the input string, and
that wasn't handled very well during interactive input, either (it
caused the whole program to stop, instead of just reporting the error
like is done for other parsing errors).
Remove the logic, and only make the parser accept exact matches.
Add a few synonyms for keywords where there is a natural abbreviation.
Change-Id: I6041baa2f5a3b7dab87c3d5991e883d2d74ace66
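The difference between the two matching rules can be sketched as below. The lookup functions are hypothetical, the keyword list in the usage note is only an example, and the real parser is not reproduced. Note also that the old code raised an exception for ambiguous prefixes, whereas this sketch just returns -1.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// New rule: a token is a keyword only if it matches exactly.
// Returns the keyword index, or -1 if there is no match.
int findExact(const std::vector<std::string> &keywords,
              const std::string              &token)
{
    for (std::size_t i = 0; i < keywords.size(); ++i)
    {
        if (keywords[i] == token)
        {
            return static_cast<int>(i);
        }
    }
    return -1;
}

// Old rule (sketched): also accept a token that is a prefix of exactly one
// keyword. Returns -1 if nothing matches or if the prefix is ambiguous.
int findByUniquePrefix(const std::vector<std::string> &keywords,
                       const std::string              &token)
{
    const int exact = findExact(keywords, token);
    if (exact >= 0)
    {
        return exact;
    }
    int match = -1;
    for (std::size_t i = 0; i < keywords.size(); ++i)
    {
        if (keywords[i].compare(0, token.size(), token) == 0)
        {
            if (match != -1)
            {
                return -1; // more than one keyword starts with the token
            }
            match = static_cast<int>(i);
        }
    }
    return match;
}
```

With keywords {"resname", "resnr", "mass"}, the old rule treats "mas" as the keyword "mass" and rejects "res" as ambiguous, so neither name is usable for a variable. Under the exact rule both lookups return -1 and the names are free.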
Mark Abraham [Mon, 29 Jun 2015 18:52:12 +0000 (20:52 +0200)]
Merge branch release-5-0
Conflicts:
src/gromacs/gmxpreprocess/hackblock.c
Used new name for header file for gmx_warning.
src/gromacs/mdlib/nbnxn_cuda/nbnxn_cuda.cu
Moved code to the other side of the sync point as
in release-5-0. Renamed cu_nb to nb.
src/gromacs/mdlib/nbnxn_cuda/nbnxn_cuda_data_mgmt.cu
Changed name of event to destroy. Renamed cu_nb to nb.
Change-Id: Iee9e2ea372ee704057a4a51ad9e4ab9a22ab7fe6
Mark Abraham [Mon, 29 Jun 2015 11:04:12 +0000 (13:04 +0200)]
Documented build types in developer guide
Change-Id: I9212c0399b59c69845f20322b7d8ace3de0b61c5
Mark Abraham [Mon, 29 Jun 2015 14:43:51 +0000 (16:43 +0200)]
Merge release-4-6 into release-5-0
Conflicts:
src/gmxlib/nonbonded/nb_free_energy.c
Deleted - change already exists in release-5-0
src/gromacs/mdlib/nbnxn_cuda/nbnxn_cuda.cu
Applied same reordering to the (unchanged) release-5-0 code
src/mdlib/nbnxn_cuda/nbnxn_cuda_types.h
Renamed misc_ops_and_local_H2D_done, and added Doxygen for it
Change-Id: I4c34d168af347a59b7821da6fea71a4715ec5bae
anca [Sat, 10 Jan 2015 21:41:39 +0000 (23:41 +0200)]
Implement OpenCL support
StreamComputing (http://www.streamcomputing.eu) has implemented the
short-ranged non-bonded interaction acceleration features previously
accelerated with CUDA using OpenCL 1.1. Supported devices include
GCN-based AMD GPUs and NVIDIA GPUs.
Compilation requires an OpenCL SDK installed. This is included in
the CUDA SDK in that case.
The overall project is not complete, but Gromacs runs correctly on
supported devices. It only runs fast on AMD devices, because of a
limitation in the Nvidia driver. A list of known TODO items can be
found in docs/OpenCLTODOList.txt. Only devices with a warp/wavefront
size that is a multiple of 32 are compatible with the implementation.
Known issues include that tabulated Ewald kernels do not work (but the
analytical kernels are on by default, as with CUDA), and the blocking
behaviour of clEnqueue in Nvidia drivers means no overlap of CPU and
GPU computation occurs. Concerns about concurrency correctness with
context management, JIT compilation, and JIT caching means several
features are disabled for now. FastGen is enabled by default, so the
JIT compilation will only compile kernels needed for the current
simulation.
There is some duplication between the two GPU implementations, but
the active development expected for both of them suggests it is
not worthwhile consolidating the implementations more closely.
Change-Id: Ideaf16929028eb60e785feb8298c08e917394d0f
Mark Abraham [Wed, 9 Jul 2014 13:50:17 +0000 (15:50 +0200)]
Add MSAN build type
This permits GROMACS to build with Memory Sanitizer in clang >= 3.4.
Refactored the tests that check that linking to libxml2 and zlib works,
which is simpler to follow now that there are three paths. The new MSAN path is
useful because making try_compile tests work with MSAN is tricky, and
only useful for very few developers.
Fixed a missing header exposed by using a different C++ library.
Documented use in the developer guide.
Change-Id: Ia3e8077ac732386563eebfa54f2f7d71ebd74a33
Erik Lindahl [Tue, 23 Jun 2015 20:20:00 +0000 (16:20 -0400)]
Fix bug removing multiple dihedrals in main rtp entries
The Gromacs-5.0 series has had a serious bug where pdb2gmx
would only consider the first entry when several explicit
bonds were listed for the same atoms in an RTP entry. Older
topologies have worked fine.
Fixes #1704, #1755.
Change-Id: I0b34aeb905dab8ea66196cabc0745583ef6d7209
Mark Abraham [Thu, 25 Jun 2015 20:46:51 +0000 (22:46 +0200)]
Fix all use of nbnxn_simd.h
Added nbnxn_simd.h to the set of SIMD headers for which we do
per-commit testing for correct use.
Fixed a bunch of files that were acquiring the dependency
transitively, which is error-prone.
Change-Id: Ic8fb462c7790a5e723f901700dd06fd09543e58a
Teemu Murtola [Wed, 24 Jun 2015 09:16:54 +0000 (12:16 +0300)]
Make some unit tests skip file system access
Make unit tests for FileNameOptionManager not use the file system.
Introduce a FileInputRedirectorInterface to support mocking file
existence checks, and use a mock implementation in the tests.
These particular tests did quite a bit of file system access, and the
speedup is only a few ms, although significant percentually (something
like 80%). But there can be tests where this has more effect, and this
approach provides a starting point for more work on eliminating
unnecessary file system access from the tests.
The main benefit is clearer and more robust test code, as it is no
longer necessary to construct actual files and ensure that they do not
conflict with other tests or cause issues if the test crashes or such.
Change-Id: Ib9a171331e988fa7e74b16078164f477f8296c6e
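Mocking file existence checks follows the usual pattern of hiding the file system behind an interface. In the sketch below every name (FileExistenceCheckerSketch, MockFileExistenceChecker, completeFileName) is an illustrative assumption; the real FileInputRedirectorInterface is not reproduced.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical interface: the only question the code under test may ask.
class FileExistenceCheckerSketch
{
    public:
        virtual ~FileExistenceCheckerSketch() {}
        virtual bool fileExists(const std::string &name) const = 0;
};

// Mock used in tests: "existing" files are just entries in a set, so no
// real files need to be created or cleaned up.
class MockFileExistenceChecker : public FileExistenceCheckerSketch
{
    public:
        void addExistingFile(const std::string &name) { files_.insert(name); }

        bool fileExists(const std::string &name) const override
        {
            return files_.count(name) > 0;
        }

    private:
        std::set<std::string> files_;
};

// Example of code under test: returns the first base+extension that
// exists according to the checker, or an empty string if none does.
std::string completeFileName(const FileExistenceCheckerSketch &checker,
                             const std::string                &base,
                             const std::vector<std::string>   &extensions)
{
    for (const std::string &ext : extensions)
    {
        if (checker.fileExists(base + ext))
        {
            return base + ext;
        }
    }
    return std::string();
}
```

Production code would pass an implementation that queries the real file system. The tests pass the mock and declare which files "exist".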
Erik Lindahl [Sun, 28 Jun 2015 12:47:08 +0000 (14:47 +0200)]
Removed gmx protonate tool
This tool appears to have been largely unused, since
testing shows it crashes for a normal trajectory all
the way back to 4.6. Since it is only relevant for
united-atom force fields, we'll reduce the maintenance
load by simply removing it for now - it might reappear
in the future.
Refs #1618.
Change-Id: If57e250f0ffbe32bcc948d09b54b225db9724c35
Berk Hess [Mon, 4 Aug 2014 15:51:03 +0000 (17:51 +0200)]
Improve DLB+PME tuning with GPUs
With GPUs, the DD DLB can quickly limit the PME load balancing
room too much. In such cases (and only with DLB=auto) we now first
do PME load balancing without DLB and then, if DLB gets turned on,
a second round of PME load balancing.
Also fixed that when DLB limited the tuning, the fastest choice was
reset, which would often lead to stronger limitations.
Change-Id: I0087e6b8512d5574d8d0fa2db82e6e38279a82f1
Szilard Pall [Mon, 29 Jun 2015 01:58:35 +0000 (03:58 +0200)]
Fix CUDA inter-stream synchronization issue
With the introduction of multiple hardware queues in CC 3.5 and later
NVIDIA GPUs, the implicit dependency between tasks in the local and
non-local kernel got eliminated. However, as the misc_ops_done event
that we sync with in the non-local stream preceded the local coordinate
transfer, even though the tasks in the local stream are always issued
first, under rare circumstances the non-local kernel could start before
the local coordinate transfer completes. This would lead to non-local
interactions being calculated using coordinates (and charges) from the
previous step.
This change moves the synchronization point to creating a dependency
between the local coordinate transfer and non-local non-bonded kernel.
Change-Id: I0b3837d46db6469f6b1d9869a3a73b5176d93d99
Erik Lindahl [Fri, 5 Jun 2015 06:18:16 +0000 (07:18 +0100)]
Fix double precision reference SIMD and gcc bug
The double precision logical operations on floats were
incorrect for the reference SIMD implementation for C
source, and GCC appears to be buggy with C++, likely
due to strict aliasing assumptions at -O3 interfering
with the required casts - solved by sticking to unions
for now. This might be slower, but the reference
implementation is not used for production anyway.
Change-Id: If048bda298618ae67968861c4a850d080c8cce31
Erik Lindahl [Fri, 12 Jun 2015 20:17:47 +0000 (22:17 +0200)]
Fix one error and compiler warnings with Cuda & clang-3.6
Clang-3.6 on OS X can now be used by nvcc. clang found one
error related to || being used instead of | to set flag bits,
and a handful of warnings about variables in headers not being used.
The latter is caused by declaring constants in headers, and
making them static to avoid clashing symbols. However, this emits
them in every single compile unit that includes the header. Fixed
by either moving names to a cpp file, or changing to defines.
Change-Id: Ib4d59c40aa8caffc667cc202a3efe45891b2abe3
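The class of error found by clang (|| used where | was intended when setting flag bits) is easy to demonstrate. The flag names and helper functions below are hypothetical and unrelated to the actual GROMACS code that was fixed.

```cpp
#include <cassert>

// Hypothetical flag bits.
constexpr unsigned FlagA = 1u << 0;
constexpr unsigned FlagB = 1u << 1;
constexpr unsigned FlagC = 1u << 2;

// Wrong: logical OR yields a bool (true when either operand is non-zero),
// which converts to 1 regardless of which bits were requested.
unsigned combineLogical(unsigned a, unsigned b)
{
    return a || b;
}

// Right: bitwise OR keeps every bit that is set in either operand.
unsigned combineBitwise(unsigned a, unsigned b)
{
    return a | b;
}
```

For FlagB and FlagC the logical version returns 1, which happens to equal FlagA, so the wrong flag ends up set and neither requested flag is. The bitwise version returns 6.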
Peter Kasson [Mon, 22 Jun 2015 21:32:26 +0000 (17:32 -0400)]
Append _pullx_ and _pullf to pull files when -deffnm used.
Changes -deffnm behavior for pulling so that the pullx and pullf
files don't collide. Previously, this resulted in one being
backed up and checkpoint restarts failing when -deffnm was used.
(Technically this applies to anything where -px and -pf are identical
and not explicitly set, but that only happens with -deffnm.)
Additionally return fatal error if -px or -pf set and output files
collide.
Fix is now localized to the pull code.
Fixes #942 except for log file collision with pull-rotation.
Change-Id: I27b8b4ced0f307905e2c2ea4fb260376dd25dc32
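The renaming described above could look roughly like the following sketch. The helper names and the PullFileNames struct are hypothetical; the actual fix lives in the pull code and may be structured differently.

```cpp
#include <string>

// Hypothetical pair of output names, corresponding to -px and -pf.
struct PullFileNames
{
    std::string px;
    std::string pf;
};

// Inserts suffix before the file extension, or appends it when the
// name has no extension after the last path separator.
inline std::string addSuffixBeforeExtension(const std::string& name,
                                            const std::string& suffix)
{
    const std::string::size_type dot   = name.rfind('.');
    const std::string::size_type slash = name.find_last_of('/');
    const bool hasExtension = dot != std::string::npos
                              && (slash == std::string::npos || dot > slash);
    if (!hasExtension)
    {
        return name + suffix;
    }
    return name.substr(0, dot) + suffix + name.substr(dot);
}

// When both names are identical (as happens with -deffnm), make them
// distinct; otherwise leave the user's choices untouched.
inline PullFileNames resolvePullFileNames(const std::string& px,
                                          const std::string& pf)
{
    if (px != pf)
    {
        return { px, pf };
    }
    return { addSuffixBeforeExtension(px, "_pullx"),
             addSuffixBeforeExtension(pf, "_pullf") };
}
```

For example, two identical names "run.xvg" would become "run_pullx.xvg" and "run_pullf.xvg" in this sketch.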
Roland Schulz [Sat, 27 Jun 2015 23:20:15 +0000 (19:20 -0400)]
Remove unused cmake files
Change-Id: Id1a132e316539243dafa06a17ae2e8f69dc9f448
Roland Schulz [Sat, 27 Jun 2015 22:23:50 +0000 (18:23 -0400)]
Don't use check_library for libm
Using check_library doesn't work with built-ins like sqrt
with Werror. sqrt was anyhow only a placeholder for a standard
libm function. We really only need to know whether the library
exists.
Fixes #1750
Change-Id: I6a550cf8c8b8ea985b28130a4339935fb8c9741a
Berk Hess [Wed, 10 Jun 2015 08:07:35 +0000 (10:07 +0200)]
Improve pair search thread load balance
With very small systems and many OpenMP threads, especially when
using GPUs, some threads can end up without pair search work. Better
load balancing reduces the pair search time. Also the CPU non-bonded
kernel time is slightly reduced in the extreme parallelization limit.
Change-Id: Ib036ea3ba59f497eeee7afa73a71fb0e0ccd216e
Berk Hess [Mon, 3 Nov 2014 16:25:51 +0000 (17:25 +0100)]
Improved the intra-GPU load balancing
The splitting of the pair list to improve load balancing on the GPU
was based on the number of generated lists. But this number can be
high(er) due to small lists before splitting. This led to too few
lists for small systems and too many for large systems.
Now the splitting is based on the number of pairs in the list up till
now. This produces much more stable results.
Because of the more stable results, we increased the min_ci_balanced
factor from 40 to 44 (closer to the ideal 48).
With small systems on many threads we used to generate many more lists
than targeted. Because the algorithm is now far better, we increased
the minimum list size from 32 to 36 and still get fewer lists.
Change-Id: Id2210171a409ef1a27f7dc919fe806f0fe4d869c
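The idea of splitting on the accumulated pair count rather than on the number of lists can be sketched as follows. This is a simplified illustration under assumed inputs, not the nbnxn implementation; splitByPairCount and its parameters are hypothetical.

```cpp
#include <cassert>
#include <vector>

// Hypothetical sketch: walk over list entries, each contributing some
// number of pairs, and close the current sub-list once the pairs
// accumulated so far reach the target. Returns the size in pairs of
// every sub-list produced.
inline std::vector<int> splitByPairCount(const std::vector<int>& pairsPerEntry,
                                         int                     targetPairsPerList)
{
    assert(targetPairsPerList > 0);

    std::vector<int> listSizes;
    int              pairsInCurrentList = 0;
    for (int numPairs : pairsPerEntry)
    {
        pairsInCurrentList += numPairs;
        if (pairsInCurrentList >= targetPairsPerList)
        {
            listSizes.push_back(pairsInCurrentList);
            pairsInCurrentList = 0;
        }
    }
    // Keep any remainder as a final, smaller list.
    if (pairsInCurrentList > 0)
    {
        listSizes.push_back(pairsInCurrentList);
    }
    return listSizes;
}
```

Because the decision depends only on pairs counted so far, many small entries no longer inflate the number of lists.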
Szilard Pall [Wed, 24 Jun 2015 21:19:46 +0000 (23:19 +0200)]
Fix CUDA architecture dependent issues
Only device code gets generated in multiple passes and therefore
target architecture-dependent macros like __CUDA_ARCH__ or our own
IATYPE_SHMEM (which also depends on __CUDA_ARCH__) are not usable in
host code as both will be undefined. As a result, the current code
over-allocated dynamic shared memory. This has no negative side-effect.
This change replaces the use of macros with runtime device compute
capability checks. Also texture objects are now actually enabled,
which gives very minor performance improvements.
Note that on Maxwell + CUDA 7.0 there is a 20% performance regression
for the tabulated Ewald kernel (which is not used by default), which
magically disappears when texture references are used instead.
Change-Id: I1f911caad85eb38d6a8e95f3b3923561dbfccd0e
Erik Lindahl [Fri, 12 Jun 2015 18:49:29 +0000 (20:49 +0200)]
Fix IBM VSX SIMD compiles with xlc
Remove most of the previous inline asm to improve
performance (the optimizer works better w/o asm),
and make sure the VSX SIMD code compiles with XLC.
Change-Id: I3e8e9b4dd6102dd5503210e3b49b844ee5492342
Berk Hess [Mon, 15 Jun 2015 15:33:54 +0000 (17:33 +0200)]
Reduce the use of nbnxn_consts.h
Removed reference to "half-width SIMD" (got reworked a long time ago)
Change-Id: Ifcf4f8ae7d18ac14dae7c843ffe1b5787a3dbff3
Berk Hess [Mon, 25 May 2015 13:54:19 +0000 (15:54 +0200)]
Corrected nstcalclr/nstfep/nst_repl_ex checks
Several checks with twin-range electrostatics still checked for
multiples of nstlist instead of nstcalclr and nst_repl_ex was not
checked at all.
Free-energy with small nstdhdl triggered frequent, unnecessary
neighbor search. It seems this can't have caused issues with
free-energy, because nstfep needs to be a multiple of nstcalcenergy
which again needs to be a multiple of nstcalclr (was nstlist).
Change-Id: I20b8e8ab8329a3aaf904ddfa21573aee8d58c0e4
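The chain of multiples stated above (nstfep a multiple of nstcalcenergy, which in turn is a multiple of nstcalclr) can be expressed as a small check. The struct and function names are hypothetical and do not mirror the actual grompp checks.

```cpp
// Hypothetical container for the three intervals named in the message.
struct IntervalSettings
{
    int nstcalclr;
    int nstcalcenergy;
    int nstfep;
};

// True when a is a positive multiple of a positive b.
inline bool isMultipleOf(int a, int b)
{
    return a > 0 && b > 0 && a % b == 0;
}

// Checks the chain: nstfep is a multiple of nstcalcenergy, and
// nstcalcenergy is a multiple of nstcalclr.
inline bool intervalsAreConsistent(const IntervalSettings& s)
{
    return isMultipleOf(s.nstcalcenergy, s.nstcalclr)
           && isMultipleOf(s.nstfep, s.nstcalcenergy);
}
```

If both conditions hold, nstfep is automatically a multiple of nstcalclr as well, which is the transitivity the message relies on.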
Erik Lindahl [Sat, 27 Jun 2015 16:11:02 +0000 (18:11 +0200)]
Fix single-prec. softcore issue in 4.6 branch
This is a backport of 412e2974, which we likely
forgot to put in 4.6. It primarily fixes
numerical issues in single precision for
sc_power==48, but it can also improve stability
of lower-power softcore free energy.
This should not be merged into 5.0 or master.
Refs #1580. Fixes #1306.
Change-Id: I5506d714214f0eb7e0517a0eda8d74cdba816dab
Erik Lindahl [Sat, 27 Jun 2015 10:58:34 +0000 (12:58 +0200)]
Merge "Merge release-5-0 into master"
Teemu Murtola [Tue, 23 Jun 2015 12:20:29 +0000 (15:20 +0300)]
Document tool changes in 5.1
Add documentation for tool changes done in 5.1, and improve the
documentation for some of the previous changes.
Change-Id: If7b194392f4516d9a98de2adcbbd436a9acebd2d
Mark Abraham [Fri, 26 Jun 2015 12:53:15 +0000 (14:53 +0200)]
Merge release-5-0 into master
Conflicts:
src/gromacs/mdlib/calc_verletbuf.c
Included nbnxn_simd.h, consistent with fix incoming from release-5-0.
src/programs/mdrun/runner.cpp
One of the changes in release-5-0 was in the -testverlet feature,
which has been removed in master. Other changes applied normally.
Change-Id: Id5b507bce1d6f907e97ac694f02fbe7e486f6208
Erik Lindahl [Fri, 8 May 2015 07:42:11 +0000 (09:42 +0200)]
Fix compile issues on K computer
Fix a compiler bug in the latest K environment (1.2-17)
related to a bug in the system headers that we work around
by temporarily altering the ISOC99 define.
Change-Id: Ifd60c7062880ecc81bc704a0695c54c2cb342776
Berk Hess [Fri, 26 Jun 2015 07:21:22 +0000 (09:21 +0200)]
Fixed Verlet buffer issue with 2-wide SIMD
The Verlet buffer size for CPUs was always calculated for 4x4.
With 2-wide SIMD the estimate should be for 4x2, which results
in a slightly larger list buffer.
grompp now always sets rlist for a 4x4 list setup; mdrun anyhow
redetermines rlist at run time (added a note for this in grompp).
Fixes #1757.
Change-Id: If4bf9ad17b82b22c9d9f7c1dd3f88e66f2314df4
Mark Abraham [Fri, 26 Jun 2015 07:06:01 +0000 (09:06 +0200)]
Merge release-4-6 into release-5-0
Conflicts:
src/gromacs/fileio/strdb.c
src/gromacs/utility/cstringutil.c
These were not actually conflicts; the code changed here was simply
relocated in release-5-0.
src/programs/mdrun/md.c
Trajectory-writing code has moved in release-5-0, so applied some
changes to src/gromacs/fileio/trajectory_writing.c
Uncrustified, copyright bumps.
Change-Id: I11626cbf2a3efa89adf7c975df2a43b4eb3680fb
Erik Lindahl [Mon, 22 Jun 2015 12:14:15 +0000 (08:14 -0400)]
Fix buffer size issues in string database
We have cool quotes longer than 255 characters, which was the
size of the buffers in the string database code. Now the buffers are
extended to STRLEN and an fscanf is replaced by fgets in an
infrequently used routine.
Change-Id: I6cd09e6e33f9a4ff3302fb2e1ab8235e2253e87c
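A bounded line read with fgets, as opposed to an unbounded fscanf, might look like the sketch below. The constant c_lineBufferSize is a hypothetical stand-in for STRLEN, whose value is not given in the message, and readLine is not the GROMACS routine.

```cpp
#include <cstdio>
#include <cstring>
#include <string>

// Hypothetical stand-in for the STRLEN constant named in the message.
constexpr int c_lineBufferSize = 4096;

// Reads one line into *line and strips the trailing line ending.
// fgets never writes more than the buffer size, so an over-long line
// is split across calls instead of overflowing the buffer.
// Returns false at end of file or on a read error.
inline bool readLine(std::FILE* fp, std::string* line)
{
    char buffer[c_lineBufferSize];
    if (std::fgets(buffer, sizeof(buffer), fp) == nullptr)
    {
        return false;
    }
    buffer[std::strcspn(buffer, "\r\n")] = '\0';
    *line = buffer;
    return true;
}
```

A quote of 300 characters fits comfortably in this buffer, whereas it would not have fit in a 255-character one.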
Mark Abraham [Wed, 24 Jun 2015 14:42:47 +0000 (16:42 +0200)]
Fix mdrun -confout sometimes affecting final .edr frame
This makes a two-part run write .edr files that can be concatenated to
be identical to that from a one-part run. Otherwise, a single-domain
run might make molecules whole, do an update with the modified x
vector, and write a slightly different final .edr frame, even though
the .trr and .cpt (and thus the restart) were still fine.
Allocating an array of natoms rvecs isn't great, but we only need it
at the end of runs in cases where we ran a single domain, so this is
not a scaling bottleneck. Maybe the "optimization" of setting state
equal to state_global in this case is not such a good thing.
Change-Id: I4d1b49e7de0a5c1691084a612b46a10820784307
Alexey Shvetsov [Thu, 25 Jun 2015 22:07:08 +0000 (01:07 +0300)]
Fix recent compilation issue in pullcode
Undeclared variable names were used in a code path without
MPI_IN_PLACE_EXISTS. Introduced recently in 581ebdfd.
Change-Id: Ib124b7083a74bb4a3cb3e9971d8975ccf7eb0889
Erik Lindahl [Mon, 22 Jun 2015 15:48:15 +0000 (11:48 -0400)]
Fix edr appending and exact continuation
The nsteps field was not written to checkpoint files when
nstcalcenergy=nstenergy. This caused differences in nsteps
in appended energy files, which in turn caused issues in averages
and RMSD in g_energy (which is now fixed by another patch).
Also added the nsteps_sim field to the checkpoint file for
consistency.
Fixes #1342.
Change-Id: Iff8bf51aaa307a379f0e7cdb7d76d9bafb13cf13
Berk Hess [Wed, 21 Aug 2013 15:30:36 +0000 (17:30 +0200)]
Intermolecular bonded interaction support added
The .top file can now contain an [ intermolecular_interactions ]
directive, after which bonded interactions can be entered using
global atom indices.
Added a molecule definition section to the topologies chapter in the
manual. A description of the moleculetype directive was missing.
This section has a subsection on intermolecular interactions.
Change-Id: I383287dd0729fef1c54f27a4fe9f5d628445549c
Erik Lindahl [Mon, 22 Jun 2015 12:18:53 +0000 (08:18 -0400)]
Fix binary exact continuation for trajectories
The initial call to compute_globals() after
continuation would remove COM motion, which meant
trajectories would not be exactly (binary)
identical to a single trajectory - it was likely
introduced with the velocity verlet code.
Refs #1342.
Change-Id: I061648774c7625810a8b693c631c61ae377ba297
Szilard Pall [Wed, 24 Jun 2015 21:16:49 +0000 (23:16 +0200)]
Add missing macro undef in CUDA NB kernel
Harmless, as the code paths for different architectures are generated
such that they don't end up in the same compilation unit, so the two
macros remaining defined did not affect code where __CUDA_ARCH__ >= 300
is not true.
Change-Id: Ic6911e5c13781ac8a2835c3aef1457df6da60412
Roland Schulz [Fri, 5 Jun 2015 00:34:38 +0000 (20:34 -0400)]
Fix Wundef warnings
Also fixes a performance bug in gmx_simd_invsqrt_pair_d. Previously it
did an unnecessary number of iterations because it used a non-existing
preprocessor variable.
Change-Id: Idcdf3872b5a169e8690721bbe83922a4ab280da8
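The class of bug that -Wundef catches can be shown with a short sketch: in an #if, an identifier that is not defined evaluates to 0, so a misspelled macro silently selects the wrong branch. The macro and constant names below are hypothetical and are unrelated to the real gmx_simd_invsqrt_pair_d code.

```cpp
// Hypothetical configuration macro: 1 means one iteration is enough.
#define SKETCH_SINGLE_ITERATION 1

// Buggy: the macro name is misspelled (missing the final I), so it is
// undefined and treated as 0. The extra-iteration branch is taken
// without any diagnostic unless -Wundef is enabled.
#if SKETCH_SINGLE_ITERATON
constexpr int c_iterationsBuggy = 1;
#else
constexpr int c_iterationsBuggy = 2;
#endif

// Fixed: the correctly spelled macro selects the intended branch.
#if SKETCH_SINGLE_ITERATION
constexpr int c_iterationsFixed = 1;
#else
constexpr int c_iterationsFixed = 2;
#endif
```

The result of the buggy variant is still numerically valid, only slower, which is why such a mistake can survive for a long time.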
Teemu Murtola [Tue, 23 Jun 2015 06:49:40 +0000 (09:49 +0300)]
Move documentation on tool changes from wiki
Move the text on tool changes from the wiki page to the user guide,
where it is much easier to maintain together with the code changes.
For now, this only contains the 5.0-era changes more or less directly
from the wiki, but I will improve this (as a separate change) once I
have more time.
Change-Id: I4ccea605783de2949221d604e232b36c726dcaa7
Mark Abraham [Thu, 25 Jun 2015 16:57:44 +0000 (18:57 +0200)]
Merge "Merge release-5-0 into master"
Berk Hess [Wed, 4 Mar 2015 21:46:00 +0000 (22:46 +0100)]
Reenabled SIMD kernels without cut-offs
In 4.6 there was a bug that caused incorrect results with group scheme
SIMD kernels with systems without cut-offs. This seems to have been
resolved already in 5.0.
Refs #1249.
Change-Id: I86c2b2ae097769472c1748e6e03a673a58a938eb
Mark Abraham [Thu, 25 Jun 2015 08:47:53 +0000 (10:47 +0200)]
Merge release-5-0 into master
Conflicts:
docs/install-guide/install-guide.md
Applied release-5-0 changes to docs/install-guide/index.rst
docs/manual/install.tex
Applied release-5-0 changes to docs/user-guide/environment-variables.rst
src/gromacs/commandline/cmdlineprogramcontext.cpp
Applied comment from release-5-0
src/gromacs/gmxana/gmx_hbond.c
Removed smooth_tail as in release-5-0
src/gromacs/gmxlib/gmx_detect_hardware.cpp
Removed GMX_MAX_THREADS implementation as in release-5-0
(adjacent changes)
src/gromacs/utility/init.cpp
Combined changes to include statements
src/gromacs/utility/path.cpp
Combined changes to include statements
src/programs/mdrun/mdrun.cpp
Added new please_cite calls from release-5-0 to new call site
in runner.cpp
Change-Id: I5b75b12bb8a49a606b086184017b754477d65068
Mark Abraham [Wed, 24 Jun 2015 21:33:24 +0000 (23:33 +0200)]
Merge release-4-6 into release-5-0
Change-Id: I6cc179f767cc4e0fc1dc3fc0b85428df986c16df
Erik Lindahl [Tue, 23 Jun 2015 20:43:33 +0000 (16:43 -0400)]
Avoid accessing hackblocks for x2top
Previous fixes for generating dihedrals would cause us to access
the hackblock even when using x2top, which we should not do.
Fixes #1711.
Change-Id: Id0ea6d524d80b87fe257c9fd4f1c48c4be828654
Magnus Lundborg [Fri, 27 Mar 2015 11:20:38 +0000 (12:20 +0100)]
Update header when appending to TNG file.
The library now handles updating the size of the headers
by moving the first frame sets to make room (if necessary).
Therefore this is now enabled when appending to a trajectory
from mdrun.
Change-Id: I7334a1b0c251856f30e116b63161f7676055dce1
Berk Hess [Mon, 15 Jun 2015 21:56:26 +0000 (23:56 +0200)]
GPU Ewald table kernel now uses CPU spacing
The tabulated Ewald GPU kernels (only used by default on Fermi)
used a fixed table spacing. Now the same adaptive spacing as on the
CPU is used. With default settings this leads to a coarser table.
Change-Id: I35f0c89e00301ac7e1260b1c0dcc1604300da5aa