I'm just starting to use the Julia CUDArt package to manage GPU computing. When I copy data back from the GPU (for example, with to_host()), how can I make sure that I don't do so before all the necessary calculations on it have finished?
From some experimentation, it seems that to_host(CudaArray) blocks while that particular CudaArray is still being updated. So perhaps just calling it is enough to be safe? But relying on that seems a little fragile.
I am currently using the launch() function to launch my kernels, as shown in the documentation.
The CUDArt documentation gives an example using Julia's @sync macro, which looks promising. But as far as @sync is concerned, my "work" is done and I am ready to move on as soon as the kernel has been launched with launch(), not once it has finished. As far as I understand how launch() operates, there is no way to change this behavior (for example, to make it wait for the kernel it launches to finish).
How can I do this kind of synchronization?
parallel-processing julia-lang julia-gpu
Michael Ohlrogge