I use the C++ bindings for OpenCL, and when launching one of my kernels I get a cl::Error that reports -38 (CL_INVALID_MEM_OBJECT) for clEnqueueNDRangeKernel.
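For context, this is roughly how the error surfaces in my code; a minimal sketch, assuming __CL_ENABLE_EXCEPTIONS is defined before including cl.hpp and using the same names (clQueue, kernel) as in the minimal example at the end:

try {
    clQueue.enqueueNDRangeKernel(kernel, cl::NullRange, cl::NDRange(100), cl::NullRange);
    clQueue.finish();
} catch (const cl::Error &e) {
    // e.what() names the failing call ("clEnqueueNDRangeKernel"),
    // e.err() holds the code (-38 == CL_INVALID_MEM_OBJECT)
    std::cerr << e.what() << " failed with error " << e.err() << std::endl;
}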
This error is not listed as one of the possible clEnqueueNDRangeKernel errors. The notify function gives me the following output:
CL_INVALID_MEM_OBJECT error executing CL_COMMAND_NDRANGE_KERNEL on a GeForce GTX 560 (device 0).
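For completeness, this is a sketch of how the notify callback is hooked up in my real code (the callback name is my own; it is passed to the cl::Context constructor):

void CL_CALLBACK notify(const char *errinfo, const void *private_info,
                        size_t cb, void *user_data)
{
    std::cerr << "OpenCL notify: " << errinfo << std::endl;
}

// when creating the context:
cl::Context clContext(CL_DEVICE_TYPE_GPU, props, notify, 0);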
I have yet to find a minimal example demonstrating this behavior.
What can cause such an error when calling this function?
Searching with Google, I only found this answer. It claims that I need to call setKernelArg again for a memory object attached to the kernel if that object has been updated. (At least that is my interpretation; there is no detailed explanation of what counts as "updated".) However, I doubt that this is correct, although I cannot prove otherwise. Maybe you know of an official source on this?
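To make my interpretation of that answer concrete, this is what I understand it to require (a sketch; genBuffer is the helper from the minimal example below, hostData and newHostData are placeholders):

cl::Buffer buf = genBuffer(clContext, hostData);
kernel.setArg(0, buf);
clQueue.enqueueNDRangeKernel(kernel, cl::NullRange, cl::NDRange(100), cl::NullRange);

buf = genBuffer(clContext, newHostData);  // the underlying cl_mem is replaced
kernel.setArg(0, buf);                    // supposedly this has to be repeated
clQueue.enqueueNDRangeKernel(kernel, cl::NullRange, cl::NDRange(100), cl::NullRange);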
Update
After some testing, I found that adding the __global const float* parameter to the kernel introduced the error. I also found that the error only occurs every time if I clSetKernelArg this new argument after another (already existing) argument. If I set it before the other argument, it only works every second time. Of course, relying on the order is not an option, since I need to be able to set any argument at any time.
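To illustrate the two orderings (a sketch with a hypothetical second kernel argument; otherBuffer stands for the already existing argument):

// ordering that fails every time: new argument set after the existing one
kernel.setArg(1, otherBuffer);
kernel.setArg(0, genBuffer(clContext, std::vector<cl_float>(100)));

// ordering that only works every second launch: new argument set first
kernel.setArg(0, genBuffer(clContext, std::vector<cl_float>(100)));
kernel.setArg(1, otherBuffer);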
Update 2
I noticed that stepping through the code in the debugger "re-introduces" the error in the version where I set the new argument before the other one. (That is, the error then occurs every time again.)
Could this be some kind of race condition? I do not use multithreading myself, but the debugger shows 7 threads, which presumably come from Qt or the OpenCL runtime.
Minimal working example
#define __CL_ENABLE_EXCEPTIONS // so that errors surface as cl::Error exceptions
#include <CL/cl.hpp>
#include <vector>
#include <iostream>
#include <cstdlib>

#define STRINGIFY(x) #x

std::string kernel = STRINGIFY(
    __kernel void apply(__global const float *param1)
    {
    }
);

template <class T>
cl::Buffer genBuffer(const cl::Context &context, const std::vector<T> &data,
                     cl_mem_flags flags = CL_MEM_READ_ONLY)
{
    return cl::Buffer(context, flags | CL_MEM_COPY_HOST_PTR,
                      data.size() * sizeof(data[0]),
                      const_cast<T*>(&data[0]));
}

int main()
{
    std::vector<cl::Platform> clPlatforms;
    cl::Platform::get(&clPlatforms);
    cl_context_properties props[] = {
        CL_CONTEXT_PLATFORM, (cl_context_properties)clPlatforms[0](), 0};
    cl::Context clContext = cl::Context(CL_DEVICE_TYPE_GPU, props);
    std::vector<cl::Device> devices = clContext.getInfo<CL_CONTEXT_DEVICES>();
    if(devices.empty())
    {
        std::cerr << "No devices found!\n";
        exit(-1);
    }
    cl::Device clDevice = devices[0];
    cl::CommandQueue clQueue = cl::CommandQueue(clContext, clDevice, 0, 0);

    cl::Program program(clContext,
        cl::Program::Sources(1, std::make_pair(kernel.c_str(), kernel.size())));
    program.build(devices);
    cl::Kernel kernel(program, "apply");

    //this introduces the error
    kernel.setArg(0, genBuffer(clContext, std::vector<cl_float>(100)));

    //the error is triggered here
    clQueue.enqueueNDRangeKernel(kernel, cl::NullRange, cl::NDRange(100), cl::NullRange);
}
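For reference, I build and run this with something like the following (linking against the OpenCL library that ships with the NVIDIA driver):

g++ -o mwe mwe.cpp -lOpenCL
./mwe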