BioD PNPI Git Repos - alexxy/gromacs.git/commit

author	Szilárd Páll <pall.szilard@gmail.com>
	Fri, 18 Mar 2016 01:12:18 +0000 (02:12 +0100)
committer	Szilárd Páll <pall.szilard@gmail.com>
	Thu, 31 Mar 2016 00:05:24 +0000 (02:05 +0200)
commit	a512a9370548ab1d549462a7dc09a68d9d58b919
tree	6a9956682bb3ad8bee74c3c5e27eeba5f2887f27
parent	74a72a9006248e0dce25c9a73cb6c74c7e18b769

Fix multiple tMPI ranks per OpenCL device

The OpenCL context and program objects were stored in the gpu_info
struct, which was assumed to be constant per compute host and therefore
shared across the tMPI ranks. Hence, gpu_info was initialized once
and a single pointer to the data was used by all ranks.
As a result, the OpenCL context and program objects of different ranks
sharing a single device got overwritten/corrupted by one another.
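
The failure mode described above can be illustrated with a minimal sketch.
It is not the GROMACS code: the types and functions (RuntimeHandles,
DeviceInfo, initSharedHandles, initPerRankHandles) are hypothetical, plain
integers stand in for the OpenCL context and program handles, and ranks are
simulated sequentially instead of as threads. It only shows why runtime
objects kept in data shared by all ranks end up overwritten, and how
per-rank storage avoids that.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins for the OpenCL context/program handles.
// No OpenCL calls are made; each value records which rank created it.
struct RuntimeHandles {
    int context = -1;
    int program = -1;
};

// Per-host device description shared by all ranks (assumed layout).
struct DeviceInfo {
    int            deviceId = 0;
    RuntimeHandles shared;  // problematic: runtime state inside shared data
};

// Problematic pattern: every rank stores its handles in the one shared
// struct. Returns, for each rank, the context it sees afterwards.
std::vector<int> initSharedHandles(DeviceInfo* info, int numRanks)
{
    assert(info != nullptr && numRanks > 0);
    for (int rank = 0; rank < numRanks; ++rank) {
        info->shared.context = rank;  // overwrites the previous rank's handle
        info->shared.program = rank;
    }
    std::vector<int> seen(numRanks);
    for (int rank = 0; rank < numRanks; ++rank) {
        seen[rank] = info->shared.context;  // all ranks see the last writer
    }
    return seen;
}

// Per-rank pattern: only the static device description is shared; each
// rank keeps its own handles.
std::vector<RuntimeHandles> initPerRankHandles(const DeviceInfo& info, int numRanks)
{
    assert(numRanks > 0);
    (void)info;  // read-only device data; unused in this sketch
    std::vector<RuntimeHandles> perRank(numRanks);
    for (int rank = 0; rank < numRanks; ++rank) {
        perRank[rank].context = rank;
        perRank[rank].program = rank;
    }
    return perRank;
}
```

With three simulated ranks, initSharedHandles leaves every rank looking at
the handles written by the last rank, whereas initPerRankHandles gives each
rank the handles it created itself.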

Notes:
- MPI still segfaults in clCreateContext() with multiple ranks per node
  both with and without GPU sharing, so no changes on that front.
- The AMD OpenCL runtime overhead with all hw threads used is quite
  significant; as a short-term solution we should consider avoiding
  using HT by launching fewer threads (and/or warning the user).

Refs #1804

Change-Id: I7c6c53a3e6a049ce727ae65ddf0978f436c04579

docs/user-guide/mdrun-performance.rst
src/gromacs/gpu_utils/cudautils.cuh
src/gromacs/gpu_utils/gpu_utils_ocl.cpp
src/gromacs/gpu_utils/oclutils.h
src/gromacs/hardware/detecthardware.cpp
src/gromacs/mdlib/forcerec.cpp
src/gromacs/mdlib/nbnxn_ocl/nbnxn_ocl.cpp
src/gromacs/mdlib/nbnxn_ocl/nbnxn_ocl_data_mgmt.cpp
src/gromacs/mdlib/nbnxn_ocl/nbnxn_ocl_jit_support.cpp
src/gromacs/mdlib/nbnxn_ocl/nbnxn_ocl_types.h