Remove redundant H2D/D2H velocity copies when update is offloaded

When update is offloaded:

The H2D copy of velocities is done:
1. On search steps, after the device buffers are reinitialized.

The D2H copy of velocities is done:
1. At the beginning of the step on search steps (to back up the
   velocities before the device buffers are reinitialized).
2. At the beginning of velocity output steps.
3. After update when globals are computed.
4. After update when the temperature is needed for the next step.
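
The conditions listed above can be sketched as three predicates. This is
only an illustration of the decision logic, not the code in this change:
StepFlags and all function and field names are hypothetical and are not
GROMACS identifiers.

```cpp
// Hypothetical per-step flags; the names are illustrative only.
struct StepFlags
{
    bool useGpuForUpdate;     // update (integration) is offloaded
    bool isSearchStep;        // device buffers are reinitialized on this step
    bool writeVelocities;     // velocity output is due on this step
    bool computeGlobals;      // globals are computed after update
    bool needTemperatureNext; // temperature is needed for the next step
};

// H2D: only on search steps, after the device buffers were reinitialized.
bool copyVelocitiesToDevice(const StepFlags& f)
{
    return f.useGpuForUpdate && f.isSearchStep;
}

// D2H at the beginning of the step: backup before the buffers are
// reinitialized on search steps, or velocities are about to be written.
bool copyVelocitiesToHostAtStepStart(const StepFlags& f)
{
    return f.useGpuForUpdate && (f.isSearchStep || f.writeVelocities);
}

// D2H after update: globals are computed, or the temperature is needed
// for the next step.
bool copyVelocitiesToHostAfterUpdate(const StepFlags& f)
{
    return f.useGpuForUpdate && (f.computeGlobals || f.needTemperatureNext);
}
```

On a step where none of the flags besides useGpuForUpdate is set, all
three predicates are false and the velocities stay on the device.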

The Local locality is used for the copies when update is offloaded, in
anticipation of the multi-GPU case.

With this change, REMD simulations are not supported when update is
offloaded.

Change-Id: Ifbb9636cafba8980a4a781d942420c5c2c1bcdfd