image2d_t is corrupted when passed to the OpenCL kernel

I'm writing a path tracer in Haskell and OpenCL, and I've run into a problem passing an image2d_t to my kernel to write the output to. Specifically, calling any of the get_image_* functions on the image2d_t inside the kernel returns meaningless values (usually either 0 or 2^24 - 1), and write_imagef does nothing. This only happens on the GPU; on the CPU everything works correctly. Calling clGetImageInfo on the host returns the correct values. The Haskell bindings to OpenCL convert error codes to exceptions, so I'm sure errors are being checked. clinfo reports my version as "OpenCL 1.2 AMD-APP (1084.2)". I should note that I've hit (and reported) several bugs that caused the OpenCL compiler to segfault or fail to link, so this may be fallout from one of those rather than an error in my code.

I initialize OpenCL like this (it should hopefully be reasonably clear even to people who don't know Haskell):

    (platform:_) <- clGetPlatformIDs
    (device:_) <- clGetDeviceIDs platform CL_DEVICE_TYPE_GPU
    glContext <- glXGetCurrentContext
    glDisplay <- glXGetCurrentDisplay
    context <- clCreateContext [CL_GL_CONTEXT_KHR glContext, CL_GLX_DISPLAY_KHR glDisplay] [device] putStrLn
    queue <- clCreateCommandQueue context device []
    source <- readFile "pt.cl"
    program <- clCreateProgramWithSource context source
    clBuildProgram program [device] "-cl-strict-aliasing"
        `catch` (\e -> case (e :: CLError) of
            CL_BUILD_PROGRAM_FAILURE ->
                putStrLn "Building OpenCL program failed:"
                >> clGetProgramBuildLog program device >>= putStrLn
                >> throw e
            _ -> return ())
    kernel <- clCreateKernel program "sample"
    pCorners <- mallocArray 4
    buffer <- clCreateBuffer context [CL_MEM_READ_ONLY, CL_MEM_USE_HOST_PTR] (4*sizeOf (undefined :: V.Vec4F), castPtr pCorners)
    clSetKernelArgSto kernel 1 buffer
    tex@(TextureObject texid) <- head <$> genObjectNames 1
    activeTexture $= TextureUnit 0
    textureBinding Texture2D $= Just tex
    textureFilter Texture2D $= ((Nearest, Nothing), Nearest)
    textureWrapMode Texture2D S $= (Repeated, Clamp)
    textureWrapMode Texture2D T $= (Repeated, Clamp)
    texImage2D Nothing NoProxy 0 RGBA' (TextureSize2D initialWidth initialHeight) 0 (PixelData RGBA UnsignedByte nullPtr)
    image <- clCreateFromGLTexture2D context [CL_MEM_READ_WRITE] gl_TEXTURE_2D 0 texid
    clSetKernelArgSto kernel 2 image

And this is called (somewhat simplified) to run the kernel and draw the result:

    clSetKernelArgSto kernel 0 position
    pokeArray pCorners orientedCorners -- update the pCorners array
    finish -- this is glFinish()
    clEnqueueAcquireGLObjects queue [image] []
    clEnqueueNDRangeKernel queue kernel [width, height] [] []
    clEnqueueReleaseGLObjects queue [image] []
    clFinish queue
    drawElements TriangleFan 4 UnsignedInt offset0
    swapBuffers

Finally, the test kernel:

    __kernel void sample(float3 position, __constant float3 corner[4], image2d_t output)
    {
        write_imagef(output, (int2)(get_global_id(0), get_global_id(1)), (float4)(0, 0.5f, 1, 1));
    }

The result is a full-screen quad displaying a random, uninitialized chunk of GPU memory, when it should be a solid blue full-screen quad. I previously had some code displaying the results of the get_image_* functions, but it started to cause the program to hang.

2 answers

I had a similar problem. After reordering the kernel arguments so that all the image2d_t arguments come first, it worked; specifically, get_image_dim then returned the correct results. I don't know whether this is a driver bug. My GPU: ATI Radeon 7950.
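
For illustration, a minimal sketch of that workaround applied to the question's test kernel (hypothetical; the image argument simply moves to position 0, and the host-side clSetKernelArgSto indices shift to match; it is shown with the write_only qualifier that the next answer discusses):

    // Hypothetical reordering: the image2d_t argument moved to the front.
    // With it last, get_image_dim returned garbage on this driver;
    // with it first, it reportedly returned the real dimensions.
    __kernel void sample(write_only image2d_t output,
                         float3 position, __constant float3 corner[4])
    {
        int2 dim = get_image_dim(output); // now returns the true image size
        if (get_global_id(0) < dim.x && get_global_id(1) < dim.y)
            write_imagef(output, (int2)(get_global_id(0), get_global_id(1)),
                         (float4)(0, 0.5f, 1, 1));
    }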


The OpenCL spec has rules about this: an image2d_t kernel argument takes an access qualifier.

There are two such qualifiers:

  • read_only (or __read_only)
  • write_only (or __write_only)

They are mutually exclusive and cannot be combined, so you cannot read and write the same image in one kernel. This matters if you intend to do accumulation work with your image, which I suspect a Monte Carlo application such as a path tracer will. If you omit the qualifier, it simply defaults to read_only, which is unfortunately the wrong choice for an output image.
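
To make the accumulation point concrete, the usual workaround is to ping-pong between two images: read the running average from one, write the update to the other, and have the host swap them between launches. A minimal sketch; the kernel name, argument names, and blend logic here are illustrative, not from the question:

    // Illustrative ping-pong accumulation (hypothetical names).
    // accum_in holds the running average; the updated average goes to
    // accum_out. Host code swaps the two images between launches.
    __kernel void accumulate(read_only image2d_t accum_in,
                             write_only image2d_t accum_out,
                             float4 new_sample, float frame_count)
    {
        const sampler_t smp = CLK_NORMALIZED_COORDS_FALSE |
                              CLK_ADDRESS_CLAMP_TO_EDGE | CLK_FILTER_NEAREST;
        int2 pos = (int2)(get_global_id(0), get_global_id(1));
        float4 prev = read_imagef(accum_in, smp, pos);
        // Incremental mean: avg_n = avg_(n-1) + (x_n - avg_(n-1)) / n
        write_imagef(accum_out, pos, prev + (new_sample - prev) / frame_count);
    }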

The fix is simply to qualify your image argument with write_only, or, if you also need to read from it, to use some kind of swap scheme like the one sketched above (or to use a global memory buffer, which can be read and written in the same kernel, but that makes the CL/GL interop more complex, and you lose the ability to fetch...).
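
Applied to the question's test kernel, that is a one-keyword change; a minimal sketch (only the write_only qualifier is new):

    __kernel void sample(float3 position, __constant float3 corner[4],
                         write_only image2d_t output) // access qualifier added
    {
        write_imagef(output, (int2)(get_global_id(0), get_global_id(1)),
                     (float4)(0, 0.5f, 1, 1));
    }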

As for why it works on the CPU: my guess is that the CPU has no dedicated read-only "texture memory", so even though writing to a read_only image is technically illegal, it happens to work and the runtime* lets you do it. The GPU, on the other hand, has read-only memory partitions that cannot be written while a kernel is running, no matter how hard you try (or perhaps the runtime for your GPU device is simply stricter).

* By "runtime" here I mean the OpenCL implementation for your device, not your program, of course.

