Enable StatePropagatorGpuData for force transfers
Force transfers were already switched to StatePropagatorGpuData in an
earlier change. This change updates the synchronization mechanisms as follows:
- replaces the previous stream sync after GPU buffer/ops reduction with
a waitForcesReadyOnHost call;
- removes the barriers in copyForces[From|To]Gpu() as dependencies
are now satisfied: most dependencies are intra-stream and therefore
implicit, the exception being the halo exchange that uses its own
mechanism to sync H2D in the local stream with the nonlocal stream
(which is yet to be replaced, refs #3093).
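The difference between a full stream sync and waiting on a dedicated
"forces ready" event can be sketched in plain C++ as below. This is only
an illustration of the idea, not the GROMACS implementation: the worker
thread stands in for a GPU stream, and ForcesReadyEvent and
runForceTransfer are hypothetical names that exist in neither
StatePropagatorGpuData nor any GPU runtime.

```cpp
#include <cassert>
#include <future>
#include <thread>
#include <vector>

// Hypothetical stand-in for a GPU event that marks "forces are ready on host".
class ForcesReadyEvent
{
public:
    void markReady() { ready_.set_value(); }
    void wait() { done_.wait(); }

private:
    std::promise<void> ready_;
    std::future<void>  done_ = ready_.get_future();
};

// The worker thread models a single stream: operations run in submission
// order, so the reduction -> copy dependency is implicit (intra-stream).
std::vector<float> runForceTransfer(const std::vector<float>& deviceForces)
{
    std::vector<float> hostForces(deviceForces.size(), 0.0F);
    ForcesReadyEvent   forcesReadyOnHost;

    std::thread stream([&]() {
        std::vector<float> reduced = deviceForces;
        for (float& f : reduced)
        {
            f *= 2.0F; // stand-in for the buffer ops reduction
        }
        hostForces = reduced;          // stand-in for the D2H force copy
        forcesReadyOnHost.markReady(); // event recorded right after the copy
        // Later work in the same stream would go here; the host does not
        // need to wait for it in order to read the forces.
    });

    // Wait only for the event, not for everything queued in the stream.
    forcesReadyOnHost.wait();
    std::vector<float> result = hostForces;
    assert(result.size() == deviceForces.size());

    stream.join(); // only so the sketch shuts down cleanly
    return result;
}
```

In this model the host blocks only until the copy has completed, which
is the role the waitForcesReadyOnHost call plays in place of the
previous stream sync.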
Refs. #3126.
Change-Id: I8bfd39f79c87f20492c4ae287d6f19261724f806