12. ``IBM_VSX`` Power7, Power8, Power9 and later have this.
13. ``ARM_NEON`` 32-bit ARMv7 with NEON support.
14. ``ARM_NEON_ASIMD`` 64-bit ARMv8 and later.
+15. ``ARM_SVE`` 64-bit ARMv8 and later with the Scalable Vector Extensions (SVE).
+ The SVE vector length is fixed at CMake configure time. The default vector
+ length is 512 bits, and this can be changed via the ``GMX_SIMD_ARM_SVE_LENGTH``
+ CMake variable.
The CMake configure system will check that the compiler you have
chosen can target the architecture you have chosen. mdrun will check
configure with ``GMX_USE_RDTSCP=off``. Non-x86 platforms are
unaffected, except that they will no longer report that RDTSCP is
disabled (because that is self-evident).
+
+armv8+sve support (ARM_SVE)
+"""""""""""""""""""""""""""
+Support for ARM Scalable Vector Extensions (SVE) has been added.
+|Gromacs| supports an SVE vector length that is fixed at CMake configure time
+(typically via the ``-msve-vector-bits=<len>`` compiler option),
+which at the time of this release is supported by GCC 10 and later,
+and will soon be supported by LLVM 12 and compilers based on it.
+The default vector length is 512 bits, and it can be changed at
+CMake configure time with the ``GMX_SIMD_ARM_SVE_LENGTH=<bits>`` option.
+Supported values are 128, 256, 512 and 1024. Note that the nonbonded
+kernels have not yet been optimized for ARM_SVE.
+ARM_SVE support is contributed by the Research Organization for Science Information and Technology (RIST).
configured with ``GMX_SIMD=AVX2_256`` instead of ``GMX_SIMD=AVX512`` for better
performance in GPU accelerated or highly parallel MPI runs.
+Some recent ARM-based CPUs, such as the A64FX, support the Scalable Vector Extensions (SVE).
+Though SVE can be used to generate fairly efficient Vector Length Agnostic (VLA) code,
+VLA is not a good fit for GROMACS, which currently assumes the SIMD vector length is known at
+CMake time. Consequently, the SVE vector length must be fixed at CMake time. The default
+value is 512 bits, and this can be changed with ``GMX_SIMD_ARM_SVE_LENGTH=<len>``.
+The supported vector lengths are 128, 256, 512 and 1024 bits. Since the GROMACS optimized non-bonded kernels
+only support up to 16 floating-point numbers per SIMD vector, a 1024-bit vector length is only
+valid in double precision (i.e. ``-DGMX_DOUBLE=on``).
+Note that although `mdrun` checks the SIMD vector length at runtime, running with a different
+vector length than the one used at CMake time is undefined behavior, and `mdrun` might crash before reaching
+the check (which would otherwise abort with a user-friendly error message).
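+
+The length restriction above follows from simple lane arithmetic: a vector of
+``N`` bits holds ``N/32`` single-precision or ``N/64`` double-precision values,
+so 1024 bits exceeds the 16-value kernel limit only in single precision. The
+sketch below is illustrative and not part of GROMACS; the helper names
+``simd_lanes`` and ``sve_cmake_flags`` are hypothetical, and it assumes the
+usual ``-DGMX_SIMD=ARM_SVE`` form for selecting the SIMD level.

```python
# Illustrative helper, NOT part of GROMACS: validates an SVE vector length
# against the limits described in the text and builds matching CMake options.

SUPPORTED_SVE_BITS = (128, 256, 512, 1024)  # values listed in the text
MAX_LANES = 16  # kernels support at most 16 floating-point numbers per vector


def simd_lanes(bits, double_precision):
    """Number of floating-point values held by one vector of `bits` bits."""
    return bits // (64 if double_precision else 32)


def sve_cmake_flags(bits=512, double_precision=False):
    """Return CMake options for the given SVE length, or raise ValueError."""
    if bits not in SUPPORTED_SVE_BITS:
        raise ValueError(f"unsupported SVE vector length: {bits} bits")
    lanes = simd_lanes(bits, double_precision)
    if lanes > MAX_LANES:
        raise ValueError(
            f"{bits} bits gives {lanes} values per vector, above the limit of "
            f"{MAX_LANES}; use double precision for this length"
        )
    flags = ["-DGMX_SIMD=ARM_SVE", f"-DGMX_SIMD_ARM_SVE_LENGTH={bits}"]
    if double_precision:
        flags.append("-DGMX_DOUBLE=on")
    return flags


if __name__ == "__main__":
    # Print the options for a 256-bit single-precision build.
    print(" ".join(sve_cmake_flags(256)))
```

+Under these assumptions, ``sve_cmake_flags(1024)`` raises an error, while
+``sve_cmake_flags(1024, double_precision=True)`` is accepted.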
+
Process(-or) level parallelization via OpenMP
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^