BlueGene/Q Verlet cut-off scheme kernels
author Mark Abraham <mark.j.abraham@gmail.com>
Mon, 19 Aug 2013 16:33:55 +0000 (18:33 +0200)
committer Gerrit Code Review <gerrit@gerrit.gromacs.org>
Fri, 18 Oct 2013 07:39:59 +0000 (09:39 +0200)
commit 25eb0e14db996febfe78195a2e63ee2874ae84f7
tree 0cceec1924edec58131b3c2cc543b75dc9570146
parent 2bd5f64f2a9021c872fc06f1689492fea283413b

The kernels are implemented with small functions whose inlining
is guaranteed by the use of xlc and clang extensions. This is a hack;
I plan to implement a general solution in the master branch.
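
As an illustration of the forced-inlining approach described above, here
is a minimal sketch. It assumes the always_inline function attribute,
which clang and GCC accept and which xlc is also expected to accept. The
macro and function names are hypothetical and are not the ones used in
the kernels.

```c
/* Hypothetical macro, not a GROMACS name. Compilers that accept the
 * GNU-style attribute get a forced inline; others fall back to a plain
 * inline hint, which the compiler is free to ignore. */
#if defined(__GNUC__) || defined(__clang__) || defined(__IBMC__)
#define SKETCH_ALWAYS_INLINE static inline __attribute__((always_inline))
#else
#define SKETCH_ALWAYS_INLINE static inline
#endif

/* Small helper of the kind a kernel is composed from: a*b + c. */
SKETCH_ALWAYS_INLINE float sketch_madd(float a, float b, float c)
{
    return a * b + c;
}

/* Squared distance between two 3-vectors, built from the helper above.
 * With forced inlining, the three helper calls should compile to
 * straight-line arithmetic with no call overhead. */
SKETCH_ALWAYS_INLINE float sketch_dist2(const float *x, const float *y)
{
    float d2 = 0.0f;
    int   i;

    for (i = 0; i < 3; i++)
    {
        float d = x[i] - y[i];
        d2 = sketch_madd(d, d, d2);
    }
    return d2;
}
```

The point of the macro is that the kernel body can be written as ordinary
function calls while still behaving like the macro-expanded code it
replaces.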

Other BG/Q considerations:

Architecture detection now works on the A2 core.

Install guide updated.

It is better to use intra-node communicators than not. Ranks within
a node are now detected correctly by querying the BlueGene/Q API,
since the hostname is not useful for this purpose.

It is better to not set GMX_DD_SENDRECV2.

It is better to use the analytical Ewald correction.

In principle, we should make the type of the variables and fields named
d2, rl2, rbb2 in nbnxn_search*[ch] platform-dependent: double on PowerPC
and float everywhere else (each regardless of GROMACS target precision).
This would mean that on PowerPC (where all flops take place in double
precision with free precision-extension upon load) we can be both
cache-efficient by storing bounding boxes in float, and flop-efficient
by not having to generate a round-to-single instruction to compare the
result of subc_bb_dist2_simd4 with the cut-off stored as a
float. Still, a flop per bounding-box distance comparison will not
break the bank.
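
A minimal sketch of what such a platform-dependent type could look like
follows. This is not implemented in this change. The type, struct and
function names are hypothetical, and the scalar distance routine only
stands in for subc_bb_dist2_simd4; the PowerPC macros tested are the
usual compiler-predefined ones.

```c
/* Hypothetical type for d2, rl2, rbb2: double on PowerPC, where the
 * arithmetic is done in double anyway, and float everywhere else. */
#if defined(__PPC__) || defined(__powerpc__) || defined(__ppc__)
typedef double sketch_bb_dist_t;
#else
typedef float sketch_bb_dist_t;
#endif

/* Bounding boxes stay in float on all platforms for cache efficiency. */
typedef struct
{
    float lower[3];
    float upper[3];
} sketch_bb_t;

/* Squared distance between two axis-aligned bounding boxes; zero when
 * they overlap. The float corners are widened on load if the distance
 * type is double. */
static sketch_bb_dist_t sketch_bb_dist2(const sketch_bb_t *a,
                                        const sketch_bb_t *b)
{
    sketch_bb_dist_t d2 = 0;
    int              d;

    for (d = 0; d < 3; d++)
    {
        sketch_bb_dist_t dl = (sketch_bb_dist_t)a->lower[d] - b->upper[d];
        sketch_bb_dist_t dh = (sketch_bb_dist_t)b->lower[d] - a->upper[d];
        sketch_bb_dist_t dm = (dl > dh) ? dl : dh;

        if (dm > 0)
        {
            d2 += dm * dm;
        }
    }
    return d2;
}

/* The cut-off is held in the same type as the distance, so the compare
 * needs no rounding of the distance back to single precision. */
static int sketch_bb_within_cutoff(const sketch_bb_t *a,
                                   const sketch_bb_t *b,
                                   sketch_bb_dist_t   rbb2)
{
    return sketch_bb_dist2(a, b) < rbb2;
}
```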

Enough bgclang support exists for the build to succeed (no platform
file is required), even with OpenMP, but a number of compiler issues
have been reported on the llvm-bgq-discuss mailing list.

Change-Id: I98c5791ec3766cdbdcb8a8eb7418d00585727cc0
39 files changed:
CMakeLists.txt
admin/installguide/installguide.tex
cmake/Platform/BlueGeneQ-base.cmake
cmake/TestBlueGeneQ.c [new file with mode: 0644]
cmake/TestX86.c [new file with mode: 0644]
cmake/gmxDetectAcceleration.cmake
cmake/gmxDetectTargetArchitecture.cmake [new file with mode: 0644]
cmake/gmxGetCompilerInfo.cmake
cmake/gmxManageBlueGene.cmake
cmake/gmxSetBuildInformation.cmake
cmake/gmxTestInlineASM.cmake
include/gmx_simd4_macros.h
include/gmx_simd_macros.h
include/gmx_simd_ref.h
include/network.h
include/types/nb_verlet.h
include/types/nbnxn_pairlist.h
src/config.h.cmakein
src/gmxlib/gmx_cpuid.c
src/gmxlib/network.c
src/kernel/runner.c
src/mdlib/forcerec.c
src/mdlib/nbnxn_atomdata.c
src/mdlib/nbnxn_cuda/nbnxn_cuda_data_mgmt.cu
src/mdlib/nbnxn_kernels/nbnxn_kernel_simd_utils.h
src/mdlib/nbnxn_kernels/nbnxn_kernel_simd_utils_ibm_qpx.h [new file with mode: 0644]
src/mdlib/nbnxn_kernels/nbnxn_kernel_simd_utils_ref.h
src/mdlib/nbnxn_kernels/nbnxn_kernel_simd_utils_x86_128d.h
src/mdlib/nbnxn_kernels/nbnxn_kernel_simd_utils_x86_128s.h
src/mdlib/nbnxn_kernels/nbnxn_kernel_simd_utils_x86_256d.h
src/mdlib/nbnxn_kernels/nbnxn_kernel_simd_utils_x86_256s.h
src/mdlib/nbnxn_kernels/simd_2xnn/nbnxn_kernel_simd_2xnn_common.h
src/mdlib/nbnxn_kernels/simd_2xnn/nbnxn_kernel_simd_2xnn_inner.h
src/mdlib/nbnxn_kernels/simd_4xn/nbnxn_kernel_simd_4xn_common.h
src/mdlib/nbnxn_kernels/simd_4xn/nbnxn_kernel_simd_4xn_inner.h
src/mdlib/nbnxn_kernels/simd_4xn/nbnxn_kernel_simd_4xn_outer.h
src/mdlib/nbnxn_search.c
src/mdlib/nbnxn_search.h
src/mdlib/nbnxn_search_simd_4xn.h