Make the use of GpuEventSynchronizer in SYCL conformant with CUDA/OpenCL
MR !1035 refactored the use of GpuEventSynchronizer in CUDA and OpenCL
to make merging the code paths easier. Here, we update SYCL to the same
standard.
Note that this change introduces additional explicit synchronization
between the local and non-local queues. Such synchronization is present
in CUDA and OpenCL, but was implicit in SYCL. It is added here to
simplify the code. If it turns out to be detrimental to performance, it
can be (conditionally) turned into a no-op.
Refs #2608, #3895.