Move pmalloc(...)/pfree(...) to separate source files in CUDA/OpenCL/SYCL