GLSL semaphores?

I ran into a problem where I wanted to blend color values in an image unit by doing something like:

vec4 texelCol = imageLoad(myImage, myTexel);
imageStore(myImage, myTexel, texelCol + newCol);

In a scenario where several fragments can have the same value for 'myTexel', this does not work, because there is no way to make the imageLoad and imageStore pair atomic, and other shader invocations can change the texel color between the two calls.

Now someone told me that people work around this problem by building semaphores out of atomic operations on uint textures. The shader waits in a loop until the texel becomes free, then atomically writes into the integer texture to block other fragment shader invocations, processes the color texel, and finally releases the integer texel atomically when it is done.

But I can't understand how this could work or what the code would look like.

Is this really possible? Can a GLSL fragment shader be made to wait in a while loop? If so, can someone give an example?

concurrency opengl glsl
1 answer

Essentially, you are just implementing a spinlock. Only instead of a single lock variable, you have an entire texture's worth of locks.
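For reference, the pattern being asked about is usually written roughly like the sketch below. This is only an illustration under assumptions, not code from the question: `lockImage`, `MAX_SPINS`, the binding points, the image formats, and `newCol` being a uniform are all made up for the example. The critical section sits inside the branch that wins the compare-and-swap, and the loop is bounded so the shader cannot spin forever. The price of the bound is that a contended update may simply be dropped.

```glsl
#version 430 core

// Color image from the question; the format is an assumption.
layout(binding = 0, rgba32f) coherent uniform image2D myImage;
// Hypothetical lock texture: one uint per texel, 0 = free, 1 = held.
// It must be cleared to 0 before the draw call.
layout(binding = 1, r32ui) coherent uniform uimage2D lockImage;

// Assumed to be a uniform here; in practice it could be any per-fragment value.
uniform vec4 newCol;

// Hypothetical upper bound on attempts, so a stalled lock owner cannot hang us.
const int MAX_SPINS = 1024;

void main()
{
    ivec2 myTexel = ivec2(gl_FragCoord.xy);
    bool done = false;

    for (int i = 0; i < MAX_SPINS && !done; ++i)
    {
        // Try to take the lock: write 1 only if the current value is 0.
        // The call returns the value that was stored before the operation.
        if (imageAtomicCompSwap(lockImage, myTexel, 0u, 1u) == 0u)
        {
            // Critical section: the read-modify-write from the question.
            vec4 texelCol = imageLoad(myImage, myTexel);
            imageStore(myImage, myTexel, texelCol + newCol);

            // Make the store visible to other invocations before releasing.
            memoryBarrier();

            // Release the lock.
            imageAtomicExchange(lockImage, myTexel, 0u);
            done = true;
        }
    }
    // If done is still false here, this fragment's contribution was lost.
}
```

Keeping the locked work inside the `if` rather than after a bare `while` loop is a common way to make this behave better on hardware that runs invocations in lockstep, but nothing in the OpenGL specification promises that it will.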

Logically, what you are doing makes sense. But as far as OpenGL is concerned, it is not guaranteed to work.

See, the OpenGL shader execution model states that invocations execute in an order that is largely undefined relative to one another. But spinlocks only work if there is a guarantee of forward progress among the various threads. In essence, a spinlock requires that the thread that is spinning cannot starve the execution unit of the chance to run the thread it is waiting on.

OpenGL does not provide such a guarantee. This means that one thread can lock a pixel and then stop executing (for whatever reason), while another thread comes along and blocks on that pixel. The blocked thread never stops executing, and the thread that owns the lock never resumes execution.
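To make the failure concrete, here is a sketch of the naive form of the lock, the one that waits in a bare while loop. It is shown as an example of what can hang, not as something to use. The names `lockImage` and `newCol`, the bindings, and the formats are assumptions made for the illustration.

```glsl
#version 430 core

layout(binding = 0, rgba32f) coherent uniform image2D myImage;
// Hypothetical lock texture, 0 = free, 1 = held.
layout(binding = 1, r32ui) coherent uniform uimage2D lockImage;

uniform vec4 newCol; // assumed uniform, for illustration only

void main()
{
    ivec2 myTexel = ivec2(gl_FragCoord.xy);

    // Spin until the lock is acquired. Nothing guarantees that the current
    // owner ever gets scheduled while this loop is running, so this loop
    // may never terminate.
    while (imageAtomicCompSwap(lockImage, myTexel, 0u, 1u) != 0u)
    {
        // busy-wait
    }

    // Critical section.
    vec4 texelCol = imageLoad(myImage, myTexel);
    imageStore(myImage, myTexel, texelCol + newCol);
    memoryBarrier();

    // Release. Never reached by the owner if it is the one that got paused.
    imageAtomicExchange(lockImage, myTexel, 0u);
}
```

If the invocation holding the lock is paused while the waiters keep spinning, the release is never executed and the draw call can hang, which on many systems ends in a driver reset.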

How can this happen in a real system? Well, let's say you have a fragment shader invocation group executing on some fragments from a triangle. They all lock their pixels. But then they diverge in execution because of a conditional branch inside the locked region. Divergence can mean that some of those invocations get transferred to a different execution unit. If none is available at the moment, they are effectively paused until one becomes available.

Now, let's say that some other fragment shader invocation group comes along and is assigned an execution unit before the divergent group gets one. If this group tries to spinlock on pixels held by the divergent group, it is essentially starving the divergent group of execution time while waiting for an event that will never happen.

Obviously, real GPUs have more than one execution unit, but you can imagine that with a large number of invocation groups in flight, it is quite possible for such a scenario to cause problems from time to time.
