Fixed GPU non-local F copy local conditional
With domain decomposition and GPUs the copy of the non-local part of
the host memory force buffer to the force array was conditional on
the local instead of the non-local list size. This meant that with
an empty non-local list and non-empty local list outdated non-local
forces would be copied. Conversely, with an empty local list all
non-local forces would not be added. Both things can only happen
in systems with partially empty boxes and then only rarely.
Having the local kernel, D2H copyback and F reduction called
conditionally is not useful in practice, so they are now unconditional
to avoid complicating the code.
Fixes #1721.
Change-Id: I06731b0055a4fb5a16168e7180964e0b87443b0f