You need to enqueue a barrier if you are using an out-of-order queue, as one way to enforce a dependency between commands. You can also use cl_event objects to ensure proper ordering of commands in the command queue.
If you write your code in such a way that you call clFinish after each kernel call, then using clEnqueueBarrier will not affect your code, since you already provide ordering.
clEnqueueBarrier becomes useful in a case like this:
clEnqueueNDRangeKernel(queue, kernel1); clEnqueueBarrier(queue); clEnqueueNDRangeKernel(queue, kernel2);
In this case, kernel2 depends on the results of kernel1. If the queue is out-of-order, then without the barrier kernel2 could execute before kernel1, causing incorrect behavior. You can achieve the same ordering with:
clEnqueueNDRangeKernel(queue, kernel1); clFinish(queue); clEnqueueNDRangeKernel(queue, kernel2);
because clFinish blocks until the queue is empty (all kernels and data transfers have completed). However, clFinish makes the host wait for kernel1 to finish in this case, while clEnqueueBarrier should return control to the application immediately (allowing you to enqueue more kernels or do other useful work).
As a side note, I think clFinish implicitly calls clFlush, so you won't need to call clFlush every time.
KLee1