Timing OpenCL Kernels

Question

Is this the right way to time kernel execution in OpenCL? I am particularly interested in using the C++ wrapper (unfortunately, there are not many timing examples for it).

    cl::CommandQueue queue(context, device, CL_QUEUE_PROFILING_ENABLE, &err);
    checkErr(err, "Cannot create the command queue");

    /* Warm-up */
    for (unsigned i = 0; i < NUMBER_OF_ITERATIONS; ++i) {
        err = queue.enqueueNDRangeKernel(kernel, cl::NullRange, cl::NDRange(512), cl::NullRange, NULL, NULL);
        checkErr(err, "Cannot enqueue the kernel");
    }
    queue.finish();

    /* Time kernels */
    cl::Event start, stop;
    queue.enqueueMarker(&start);
    for (unsigned i = 0; i < NUMBER_OF_ITERATIONS; ++i) {
        err = queue.enqueueNDRangeKernel(kernel, cl::NullRange, cl::NDRange(512), cl::NullRange, NULL, NULL);
        checkErr(err, "Cannot enqueue the kernel");
    }
    queue.enqueueMarker(&stop);
    stop.wait();

    cl_ulong time_start, time_end;
    double total_time;
    start.getProfilingInfo(CL_PROFILING_COMMAND_END, &time_start);
    stop.getProfilingInfo(CL_PROFILING_COMMAND_START, &time_end);
    total_time = time_end - time_start;

    /* Results: profiling timestamps are in nanoseconds, and 1 ms = 1e6 ns */
    cout << "Execution time in milliseconds " << total_time / 1e6 / NUMBER_OF_ITERATIONS << endl;

c++ profiling opencl

user1096294 Apr 14 '14 at 21:38

1 answer

Tim · Answer 1 · 2014-04-15T03:41:43+0000

I think your approach should work fine (though I may be wrong). Alternatively, if you want to time each call separately, you can pass an event to enqueueNDRangeKernel and call getProfilingInfo on that event.

    cl::Event evt;
    err = queue.enqueueNDRangeKernel(kernel, cl::NullRange, cl::NDRange(512), cl::NullRange, NULL, &evt);
    evt.wait();
    elapsed += evt.getProfilingInfo<CL_PROFILING_COMMAND_END>() -
               evt.getProfilingInfo<CL_PROFILING_COMMAND_START>();
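If you collect the start and end timestamps of every launch this way, the remaining step is only arithmetic. Here is a rough sketch of that step, kept free of OpenCL headers so it compiles on its own. `KernelTiming` and `mean_kernel_time_ms` are names I made up for illustration, and `std::uint64_t` stands in for `cl_ulong`. The one thing it relies on is that OpenCL profiling timestamps are reported in nanoseconds.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper type (not part of OpenCL): the two timestamps of one
// kernel launch. In real code they would be the values returned by
// evt.getProfilingInfo<CL_PROFILING_COMMAND_START>() and
// evt.getProfilingInfo<CL_PROFILING_COMMAND_END>(), both in nanoseconds.
struct KernelTiming {
    std::uint64_t start_ns;
    std::uint64_t end_ns;
};

// Sum the per-launch durations and return the mean in milliseconds.
// Returns 0.0 for an empty list to avoid dividing by zero.
double mean_kernel_time_ms(const std::vector<KernelTiming>& timings) {
    if (timings.empty()) {
        return 0.0;
    }
    std::uint64_t total_ns = 0;
    for (const KernelTiming& t : timings) {
        total_ns += t.end_ns - t.start_ns;
    }
    // 1 ms = 1e6 ns
    return static_cast<double>(total_ns) / 1e6 /
           static_cast<double>(timings.size());
}
```

In the loop above you would push one `KernelTiming` per `evt` instead of adding to `elapsed`, then call `mean_kernel_time_ms` once after the last `evt.wait()`.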
