Allow PME coordinate send before H2D coordinate transfer
The introduction of the GPU PME-PP communication functionality had the
side effect of delaying the PME coordinate send until after the H2D
coordinate transfer, even on the default code path. This patch allows
the PME coordinate send to occur in its original location when the
send does not originate from GPU memory. This is a lightweight fix
that adds no new functionality, so it is suitable for the release
branch. (A more comprehensive change, which also extends the GPU
PME-PP communication functionality, will follow in the master branch.)
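
The intended ordering can be sketched roughly as follows. This is an
illustrative stand-alone model, not the actual GROMACS code: the
function scheduleStep, the flag sendCoordinatesFromGpu, and the event
names are all hypothetical and exist only to show where the send sits
relative to the H2D copy on each path.

```cpp
#include <string>
#include <vector>

// Hypothetical model of the per-step ordering; names are assumptions,
// not GROMACS identifiers.
std::vector<std::string> scheduleStep(bool sendCoordinatesFromGpu)
{
    std::vector<std::string> events;

    // Send from CPU memory: nothing depends on the H2D copy, so the
    // send is issued early, in its original location.
    if (!sendCoordinatesFromGpu)
    {
        events.push_back("pme_coordinate_send_from_cpu");
    }

    events.push_back("h2d_coordinate_copy");

    // Send from GPU memory: the coordinates must be on the device
    // first, so the send follows the H2D copy.
    if (sendCoordinatesFromGpu)
    {
        events.push_back("pme_coordinate_send_from_gpu");
    }

    return events;
}
```

In this model, the early send on the CPU path is no longer held up
behind the H2D copy, while the GPU path keeps the ordering it needs.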
Implements #3159
Change-Id: Ic30c154e04bb4c2846bbad3de603f879a71b9133