GPU naming conventions

author Artem Zhmurov <zhmurov@gmail.com>

Tue, 19 Mar 2019 12:59:57 +0000 (13:59 +0100)

committer Mark Abraham <mark.j.abraham@gmail.com>

Thu, 28 Mar 2019 07:52:49 +0000 (08:52 +0100)
diff --git a/docs/dev-manual/naming.rst b/docs/dev-manual/naming.rst

index a9ba6d0ccfb3c31d5095c613c0b71016837473be..e69b3a9549ed4381de75e8a9fda674afc94974c4 100644 (file)
--- a/docs/dev-manual/naming.rst
+++ b/docs/dev-manual/naming.rst
@@ -109,7 +109,7 @@ C++ code
    in particular for settings exposed to other modules.
  * Prefer to use enumerated types and values instead of booleans as control
    parameters to functions. It is reasonably easy to understand what the
-  argument ``HelpOutputFormat_Console`` is controling, while it is almost
+  argument ``HelpOutputFormat_Console`` is controlling, while it is almost
    impossible to decipher ``TRUE`` in the same place without checking the
    documentation for the role of the parameter.
  
@@ -117,6 +117,42 @@ The rationale for the trailing underscore and the global/static prefixes is
  that it is immediately clear whether a variable referenced in a method is local
  to the function or has wider scope, improving the readability of the code.
  
+Code for GPUs
+-------------
+
+Rationale: on GPUs, using the right memory space is often performance critical.
+
+* In CUDA device code ``sm_``, ``gm_``, and ``cm_`` prefixes are used for
+  shared, global and constant memory. The absence of a prefix indicates
+  register space. The same prefixes are used in OpenCL code, where ``sm_``
+  indicates local memory and no prefix is added to variables in the private
+  address space.
+* Data transferred to and from the host has to live in both CPU and GPU
+  memory spaces. Therefore it is typical to have a pointer or container (in
+  CUDA), or memory buffer (in OpenCL) in host memory that has a device-based
+  counterpart. To easily distinguish these, the variable names for such
+  objects are prefixed ``h_`` and ``d_`` and have identical names otherwise.
+  Example: ``h_masses`` and ``d_masses``.
+* In all other cases, pointers to host memory are not required to have the
+  prefix ``h_`` (even in parts of the host code where both host and device
+  pointers are present). Device pointers should always have the prefix
+  ``d_`` or ``gm_``.
+* In case GPU kernel arguments are combined into a structure, it is preferred
+  that all device memory pointers within the structure have the prefix ``d_``
+  (e.g. ``kernelArgs.d_data`` is preferred to ``d_kernelArgs.data``,
+  whereas neither ``d_kernelArgs.d_data`` nor ``kernelArgs.data`` is
+  acceptable).
+* Note that the same pointer can have the prefix ``d_`` in the host code,
+  and ``gm_`` in the device code. For example, if ``d_data`` is passed to
+  the kernel as an argument, it should be aliased to ``gm_data`` in the
+  kernel arguments list. In case a device pointer is a field of a passed
+  structure, it can be used directly or aliased to a pointer with the ``gm_``
+  prefix (e.g. ``kernelArgs.d_data`` can be used as is or aliased to
+  ``gm_data`` inside the kernel).
+* Avoid using uninformative names for CUDA warp, thread, block indexes and
+  their OpenCL analogs (e.g. ``threadIndex`` is preferred to ``i`` or
+  ``atomIndex``).
+
  Unit tests
  ----------
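
The prefixes introduced in the "Code for GPUs" section above can be seen together in a small CUDA sketch. This is only an illustration of the naming rules, not code from the repository: the kernel `scaleMasses`, the structure `KernelArgs`, the constant `cm_scale` and the launch parameters are all hypothetical, and error checking of the CUDA runtime calls is left out for brevity.

```cuda
#include <cstdio>
#include <vector>

#include <cuda_runtime.h>

// Hypothetical scaling factor kept in constant memory: cm_ prefix.
__constant__ float cm_scale;

// Hypothetical kernel argument structure. The structure itself carries no
// prefix; the device pointer stored inside it carries d_.
struct KernelArgs
{
    float* d_masses;
    int    numAtoms;
};

constexpr int threadsPerBlock = 128;

__global__ void scaleMasses(KernelArgs kernelArgs)
{
    // Shared memory buffer: sm_ prefix. It is used here only to show the
    // prefix; each thread touches its own slot, so no synchronization is needed.
    __shared__ float sm_masses[threadsPerBlock];

    // A device pointer that is a field of a passed structure may be used
    // directly or aliased to a gm_ pointer inside the kernel.
    float* gm_masses = kernelArgs.d_masses;

    // Informative index name rather than "i".
    const int threadIndex = blockIdx.x * blockDim.x + threadIdx.x;

    if (threadIndex < kernelArgs.numAtoms)
    {
        sm_masses[threadIdx.x] = gm_masses[threadIndex];
        // No prefix: the value lives in register space.
        const float scaledMass = cm_scale * sm_masses[threadIdx.x];
        gm_masses[threadIndex] = scaledMass;
    }
}

int main()
{
    const int    numAtoms = 1000;
    const size_t numBytes = numAtoms * sizeof(float);

    // Host container and its device counterpart: same name, h_ and d_ prefixes.
    std::vector<float> h_masses(numAtoms, 1.0f);
    float*             d_masses = nullptr;

    cudaMalloc(&d_masses, numBytes);
    cudaMemcpy(d_masses, h_masses.data(), numBytes, cudaMemcpyHostToDevice);

    // A host-only value without a device counterpart needs no h_ prefix.
    const float scale = 2.0f;
    cudaMemcpyToSymbol(cm_scale, &scale, sizeof(scale));

    KernelArgs kernelArgs{ d_masses, numAtoms };
    const int  numBlocks = (numAtoms + threadsPerBlock - 1) / threadsPerBlock;
    scaleMasses<<<numBlocks, threadsPerBlock>>>(kernelArgs);

    // The blocking copy also waits for the kernel to finish.
    cudaMemcpy(h_masses.data(), d_masses, numBytes, cudaMemcpyDeviceToHost);
    cudaFree(d_masses);

    std::printf("first scaled mass: %f\n", h_masses[0]);
    return 0;
}
```

If the pointer were passed to the kernel directly instead of through a structure, the convention would have the argument named `gm_masses` in the kernel's argument list while the host side keeps calling it `d_masses`.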