Remove thread-MPI limitation for GPU PP halo exchange